BDR between Kerberos-enabled environments

Enabling Replication Between Clusters with Kerberos Authentication

To enable replication between clusters, additional setup steps are required to ensure that the source and destination clusters can communicate.

Note: If either the source or destination cluster is running Cloudera Manager 4.6 or higher, then both clusters (source and destination) must be running 4.6 or higher. For example, cross-realm authentication does not work if one cluster is running Cloudera Manager 4.5.x and one is running Cloudera Manager 4.6 or higher.

Continue reading:

  • Considerations for Realm Names
  • Configuration
  • Configuring a Peer Relationship

Considerations for Realm Names 

If the source and destination clusters each use Kerberos for authentication, use one of the following configurations to prevent conflicts when running replication jobs:

  • If the clusters do not use the same KDC (Kerberos Key Distribution Center), Cloudera recommends that you use different realm names for each cluster.
  • You can use the same realm name if the clusters use the same KDC or different KDCs that are part of a unified realm, for example where one KDC is the master and the other is a slave KDC.
  • Note: If you have multiple clusters that are used to segregate production and non-production environments, this configuration could result in principals that have equal permissions in both environments. Make sure that permissions are set appropriately for each type of environment.

 

Important: If the source and destination clusters are in the same realm but do not use the same KDC or the KDCs are not part of a unified realm, the replication job will fail.

Configuration 

 

  1. On the hosts in the source and destination clusters, ensure that the krb5.conf file (typically located at /etc/krb5.conf) on each host has the following information:
    • The kdc information for the source cluster's Kerberos realm. For example:

 

[realms]

 INTBDA.BIL.COM = {
  kdc = <–KDC__MASTER__NODE–>:88
  kdc = <–KDC__SLAVE__NODE–>:88
  admin_server = <–KDC__MASTER__NODE–>:749
  default_domain = bnet.luxds.net
 }
 DEVBDA.BIL.COM = {
  kdc = <–KDC__MASTER__NODE–>:88
  kdc = <–KDC__SLAVE__NODE–>:88
  admin_server = <–KDC__MASTER__NODE–>:749
  default_domain = bnet.luxds.net
 }

  • Domain/host-to-realm mapping for the source cluster NameNode hosts. You configure these mappings in the [domain_realm] section of the krb5.conf file. For example, to map two realms named SRC.MYCO.COM and DEST.MYCO.COM to the domains of hosts named hostname.src.myco.com and hostname.dest.myco.com, make the following mappings in the krb5.conf file:

[domain_realm]
.src.myco.com = SRC.MYCO.COM
src.myco.com = SRC.MYCO.COM
.dest.myco.com = DEST.MYCO.COM
dest.myco.com = DEST.MYCO.COM

 

BE CAREFUL!!!

In my case, however, the scenario is completely different. Since the domain names of both clusters end with bnet.luxds.net, the domain-based mapping shown above cannot be used directly. If we use it, we will hit the following error:

PriviledgedActionException as:hdfs/<–HOST_NAME–>@DEVBDA.BIL.COM (auth:KERBEROS) cause:java.io.IOException: java.lang.IllegalArgumentException:

Server has invalid Kerberos principal:

hdfs/<–HOST_NAME–>@INTBDA.BIL.COM, expecting: hdfs/<–HOST_NAME–>@DEVBDA.BIL.COM

To handle this problem, we have to change the [domain_realm] section as shown below:

From:

[domain_realm]
.bnet.luxds.net = INTBDA.BIL.COM
bnet.luxds.net = INTBDA.BIL.COM

To (also adding the other cluster's hosts; the section should contain every host of both environments):

[domain_realm]
NODE_1_CLUSTER_1= INTBDA.BIL.COM
NODE_2_CLUSTER_1= INTBDA.BIL.COM
NODE_3_CLUSTER_1= INTBDA.BIL.COM
NODE_4_CLUSTER_1= INTBDA.BIL.COM
NODE_5_CLUSTER_1= INTBDA.BIL.COM
NODE_6_CLUSTER_1= INTBDA.BIL.COM
NODE_1_CLUSTER_2= DEVBDA.BIL.COM
NODE_2_CLUSTER_2= DEVBDA.BIL.COM
NODE_3_CLUSTER_2= DEVBDA.BIL.COM

 

I have to arrange the [domain_realm] section as above on both clusters.
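On BDA, a convenient way to push the edited krb5.conf to every node and double-check the mapping is dcli, which is already used elsewhere in these notes (a sketch; run it separately on each cluster):

# push the updated /etc/krb5.conf from Node 1 to all nodes of the current cluster
dcli -C -f /etc/krb5.conf -d /etc/krb5.conf

# spot-check that the [domain_realm] section is identical everywhere
dcli -C "grep -A 12 domain_realm /etc/krb5.conf"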

Trust Creation

addprinc krbtgt/INTBDA.BIL.COM@DEVBDA.BIL.COM to DEV
addprinc krbtgt/DEVBDA.BIL.COM@INTBDA.BIL.COM to INT

 

With these two principals in place, I will be able to reach both clusters.
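For reference, a minimal sketch of creating those cross-realm principals with MIT Kerberos kadmin.local (the password placeholder is hypothetical; the krbtgt principals for a trust are normally created with identical passwords and encryption types on the KDCs involved):

# on the DEVBDA.BIL.COM KDC
kadmin.local -q "addprinc -pw <–trust_password–> krbtgt/INTBDA.BIL.COM@DEVBDA.BIL.COM"

# on the INTBDA.BIL.COM KDC
kadmin.local -q "addprinc -pw <–trust_password–> krbtgt/DEVBDA.BIL.COM@INTBDA.BIL.COM"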

Add the dfs.namenode.kerberos.principal.pattern parameter to all clusters
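A minimal sketch of that property (assuming the usual wildcard value; in Cloudera Manager it is typically placed in the HDFS advanced configuration snippet / safety valve for hdfs-site.xml on both clusters, followed by a client configuration redeploy):

<property>
  <name>dfs.namenode.kerberos.principal.pattern</name>
  <value>*</value>
</property>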


  1. On the destination cluster, use Cloudera Manager to add the realm of the source cluster to the Trusted Kerberos Realms configuration property:
    1. Go to the HDFS service.
    2. Click the Configuration tab.
    3. In the search field, type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
    4. Enter the source cluster realm.
    5. Click Save Changes to commit the changes.

 

Trusted Realm Addition on HDFS
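Once the trusted realm has been added, the client configuration redeployed and the affected services restarted, cross-realm access can be sanity-checked from a destination-cluster node (a sketch; the NameNode host placeholder and port 8020 are assumptions):

# as a user that has a ticket in the destination realm (DEVBDA.BIL.COM in this example)
kinit <–user–>@DEVBDA.BIL.COM
hadoop fs -ls hdfs://<–SOURCE_NAMENODE_HOST–>:8020/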


 

Configuring a Peer Relationship 

  1. Go to the Peers page by selecting Administration > Peers. The Peers page displays. If there are no existing peers, you will see only an Add Peer button in addition to a short message. If you have existing peers, they are listed in the Peers list.
  2. Click the Add Peer button.
  3. In the Add Peer pop-up, provide a name, the URL (including the port) of the Cloudera Manager Server that will act as the source for the data to be replicated, and the login credentials for that server. Important: The role assigned to the login on the source server must be either a User Administrator or a Full Administrator. Cloudera recommends that SSL be used, and a warning is shown if the URL scheme is http instead of https. Once both peers have been configured to use SSL/TLS, add the remote source Cloudera Manager's SSL certificate to the local Cloudera Manager truststore, and vice versa (see the sketch after this list).
  4. Click the Add Peer button in the pop-up to create the peer relationship. The peer is added to the Peers list.
  5. To test the connectivity between your Cloudera Manager Server and the peer, select Actions > Test Connectivity.
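A hedged sketch of that certificate exchange, run on the local Cloudera Manager host and then repeated in the other direction on the peer (port 7183 and the truststore path/password shown later in these notes are assumptions):

# fetch the remote (source) Cloudera Manager's certificate
echo | openssl s_client -connect <–source_cm_host–>:7183 2>/dev/null | openssl x509 -out /tmp/source_cm.pem

# import it into the truststore used by the local Cloudera Manager
keytool -importcert -noprompt -alias source_cm -file /tmp/source_cm.pem -keystore /opt/cloudera/security/jks/<–clustername–>.truststore -storepass <–truststore_password–>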



Disable all SSL certificates and go back to the initial state

1. All steps are done as ‘root’ user.

2. If you have passwordless ssh setup on all nodes you can run dcli on any node, otherwise run the dcli commands on Node 1.

3. When you get to the point of restarting the CM server, do that on the node with the CM role (Node 3 by default).

4. Make sure to run the regenerate script on Node 3.

 

1. On Node 1, back up the existing security directory:

# dcli -C "cp -r -p /opt/cloudera/security /opt/cloudera/security.BAK_`date +%d%b%Y%H%M%S`"

 

2. Verify there is a backed up file:
# dcli -C ls -ltrd /opt/cloudera/security*

 

3. Execute the script to renew the default certificates:

*********Perform all steps as ‘root’ user on Node 3*****************

a) Download and copy the regenerate.sh script to the node with the Cloudera Manager role; this is Node 3 by default.

You can download it to any directory. For example /tmp.

 

b) Give execute permissions to the script.

# chmod a+x /tmp/regenerate.sh

#########################################################################################################################
# Script should not be used for renewing a user's self-signed certificates. The script renews only BDA default certificates. #
#########################################################################################################################

#!/usr/bin/bash -x
export CMUSR="admin"
if [[ -z $CMPWD ]]; then
export CMPWD="$1"
if [[ -z $CMPWD ]]; then
echo "INFO: Since no CM password was given nothing can be done"
exit 0
fi
fi
# Collect the current keystore/truststore locations and passwords from bdacli
key_loc=`bdacli getinfo cluster_https_keystore_path`
key_password=`bdacli getinfo cluster_https_keystore_password`
trust_password=`bdacli getinfo cluster_https_truststore_password`
trust_loc=`bdacli getinfo cluster_https_truststore_path`
firstnode=(`json-select --jpx="MAMMOTH_NODE" /opt/oracle/bda/install/state/config.json`)
nodenames=(`json-select --jpx="RACKS/NODE_NAMES" /opt/oracle/bda/install/state/config.json`)
for node in "${nodenames[@]}"
do
# Export each node's current key, generate a new self-signed certificate for it and re-import it
ssh $node "keytool -importkeystore -srckeystore $key_loc -destkeystore /tmp/nodetmp.p12 -deststoretype PKCS12 -srcalias \$HOSTNAME -srcstorepass $key_password -srckeypass $key_password -destkeypass $key_password -deststorepass $key_password"
ssh $node "openssl pkcs12 -in /tmp/nodetmp.p12 -nodes -nocerts -out privateKey.pem -passin pass:$key_password -passout pass:$key_password"
ssh $node 'openssl req -x509 -new -nodes -key privateKey.pem -sha256 -days 7300 -out newCert.pem -subj "/C=/ST=/L=/O=/CN=${HOSTNAME}"'
ssh $node "keytool -import -keystore $key_loc -file newCert.pem -alias \$HOSTNAME -storepass $key_password -keypass $key_password"
ssh $node "/usr/java/latest/bin/keytool -exportcert -keystore $key_loc -alias \$HOSTNAME -storepass $key_password -file /opt/cloudera/security/jks/node.cert"
ssh $node "scp /opt/cloudera/security/jks/node.cert root@${firstnode}:/opt/cloudera/security/jks/node_\${HOSTNAME}.cert"
# Recreate the x509 files (node.cert, node.key, node.hue.key) used by the CM agent and Hue
ssh $node "rm -f /tmp/nodetmp.p12; rm -f privateKey.pem; rm -f newCert.pem; rm -f /opt/cloudera/security/x509/node.key; rm -f /opt/cloudera/security/x509/node.cert; rm -f /opt/cloudera/security/x509/node_*pem"
ssh $node "/usr/java/latest/bin/keytool -importkeystore -srckeystore $key_loc -srcstorepass $key_password -srckeypass $key_password -destkeystore /tmp/\${HOSTNAME}-keystore.p12 -deststoretype PKCS12 -srcalias \$HOSTNAME -deststorepass $key_password -destkeypass $key_password -noprompt"
ssh $node "openssl pkcs12 -in /tmp/\${HOSTNAME}-keystore.p12 -passin pass:${key_password} -nokeys -out /opt/cloudera/security/x509/node.cert"
ssh $node "openssl pkcs12 -in /tmp/\${HOSTNAME}-keystore.p12 -passin pass:${key_password} -nocerts -out /opt/cloudera/security/x509/node.key -passout pass:${key_password}"
ssh $node "openssl rsa -in /opt/cloudera/security/x509/node.key -passin pass:${key_password} -out /opt/cloudera/security/x509/node.hue.key"
ssh $node "chown hue /opt/cloudera/security/x509/node.key"
ssh $node "chown hue /opt/cloudera/security/x509/node.cert"
ssh $node "chown hue /opt/cloudera/security/x509/node.hue.key"
done

# Rebuild the cluster truststore and the Hue truststore, then distribute them to all nodes
create=`ls /opt/cloudera/security/jks/ | grep "create"`
ssh $firstnode "rm -f $trust_loc"
ssh $firstnode " /opt/cloudera/security/jks/./${create} $trust_password"
ssh $firstnode " /opt/cloudera/security/x509/./create_hue.truststore.pl $trust_password"
ssh $firstnode "dcli -C -f $trust_loc -d $trust_loc"
ssh $firstnode "dcli -C -f /opt/cloudera/security/x509/hue.pem -d /opt/cloudera/security/x509/hue.pem"
# Regenerate the agents.pem used by the CM agents from this host's certificate and distribute it
rm -f /opt/cloudera/security/jks/cm_key.der
rm -f /opt/cloudera/security/x509/agents.pem
/usr/java/latest/bin/keytool -exportcert -keystore $key_loc -alias $HOSTNAME -storepass $key_password -file /opt/cloudera/security/jks/cm_key.der
openssl x509 -out /opt/cloudera/security/x509/agents.pem -in /opt/cloudera/security/jks/cm_key.der -inform der
scp /opt/cloudera/security/x509/agents.pem root@${firstnode}:/opt/cloudera/security/x509/agents.pem
ssh $firstnode dcli -C -f /opt/cloudera/security/x509/agents.pem -d /opt/cloudera/security/x509/agents.pem

c) Run the script, providing the Cloudera Manager admin password as an argument:

# ./regenerate.sh <cm_password>

d) Upload the output to the SR for review.

 

4. Once script execution has completed, restart the Cloudera Manager server and agents.

a) Stop Cloudera Manager Agents.

# dcli -C service cloudera-scm-agent stop

b) Restart Cloudera Manager server (On Node 3)

# service cloudera-scm-server restart

c) Verify with:
# service cloudera-scm-server status

d) Start Cloudera Manager Agents.
# dcli -C service cloudera-scm-agent start

e) Verify with:
# dcli -C service cloudera-scm-agent status

 

5. Make sure there are no ssl warnings in the Cloudera Manager Server logs.

/var/log/cloudera-scm-server/cloudera-scm-server.log

You can also do:
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

and then also upload the
/var/log/cloudera-scm-server/cloudera-scm-server.log to the SR for review.
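A quick way to scan the same log for SSL/TLS-related messages before uploading it (just a convenience filter, not an exhaustive check):

# grep -iE "ssl|tls|handshake" /var/log/cloudera-scm-server/cloudera-scm-server.log | tail -50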

 

6. In CM:
a) Restart the Management services and wait until they are healthy.
b) Restart the Cluster services.

 

7. Certificate validity can be checked using keytool or openssl commands.

a) with keytool
# keytool -printcert -file /opt/cloudera/security/x509/agents.pem

b) with openssl:
echo | openssl s_client -connect <fqdn.of.cloudera.manager.webui>:7183 2>/dev/null | openssl x509 -noout -subject -dates

Enable SSL over the cluster via SAN (Subject Alternative Name)

Step-by-step guide

On BDA V4.5 and higher, using certificates signed by a user's Certificate Authority for web consoles and for Hadoop network encryption is supported. The support includes use of a client's own certificates signed by the client's Certificate Authority instead of the default, which is to use self-signed certificates generated on the BDA.

At a high level, users update the Mammoth-installed truststore with the CA public certificate and the keystores with keys/certificates signed by the customer CA, or create a new keystore and truststore on all nodes of the BDA and point Cloudera Manager to that new location.

User provided signed certificates are not allowed for puppet. The use of puppet is internal and puppet is not intended for direct user usage.

The recommendation when using customer provided signed certificates with web consoles, etc. is to use the keystores/truststores provided by Mammoth and to make the minimal changes possible.

The Mammoth installed values can be viewed with:

  • bdacli getinfo cluster_https_keystore_password – Display the password of the keystore used by CM.
  • bdacli getinfo cluster_https_keystore_path – Display the path of the keystore used by CM.
  • bdacli getinfo cluster_https_truststore_password – Display the password of the truststore used by CM.
  • bdacli getinfo cluster_https_truststore_path – Display the path of the truststore used by CM.

Note: This document and all examples use the existing passwords provided in the Mammoth commands above.  There are additional changes required if the passwords are changed.

The steps presented here can be performed in a cluster where Kerberos is or is not installed.

This document requires:

1) that a user provided CA public certificate is available for use on the BDA and

2) that the user will use the BDA node specific Certificate Signing Requests to create BDA node specific signed certificates and copy them to the BDA as specified in the document.

Prerequisites for setting up user-provided certificates for web consoles and Hadoop network encryption

On the BDA Cluster

  1. Identify the server running the Hue service.

In Cloudera Manager (CM) navigate: hue > Instances

Keep track of the server.

  2. Make sure the cluster is healthy.

a) Verify with:

bdacheckcluster

[root@host_1 cloudera]# bdacheckcluster
INFO: Logging results to /tmp/bdacheckcluster_1522049448/
Enter CM admin user to run dumpcluster
Enter username (admin):
Enter CM admin password to enable check for CM services and hosts
Press ENTER twice to skip CM services and hosts checks
Enter password:
Enter password again:
SUCCESS: Mammoth configuration file is valid.
SUCCESS: hdfs is in good health
SUCCESS: zookeeper is in good health
SUCCESS: yarn is in good health
SUCCESS: oozie is in good health
SUCCESS: hive is in good health
SUCCESS: hue is in good health
SUCCESS: yarn is in good health
SUCCESS: yarn is in good health
SUCCESS: sentry is in good health
SUCCESS: flume is in good health
SUCCESS: client is in good health
SUCCESS: Cluster passed checks on all hadoop services health check
SUCCESS: c39df580-32e2-4671-b2a4-5e47574aba5b is in good health
SUCCESS: 8b04b32a-d763-4817-a8e2-832ba024d52d is in good health
SUCCESS: dc7db617-3b2a-4517-a7a0-775df21c8be1 is in good health
SUCCESS: 4f4784b7-056d-4834-bcd2-a5b050e51a00 is in good health
SUCCESS: e406d9d0-951e-499a-90dc-a97ede1db51e is in good health
SUCCESS: 43e1382a-945a-49cd-8f56-a1670f09a6ca is in good health
SUCCESS: Cluster passed checks on all hosts health check
SUCCESS: All cluster host names are pingable
INFO: Starting cluster host hardware checks
SUCCESS: All cluster hosts pass hardware checks
INFO: Starting cluster host software checks
host_5:
host_6:
host_2:
host_1:
host_4:
host_3:
SUCCESS: All cluster hosts pass software checks
SUCCESS: All ILOM hosts are pingable
SUCCESS: All client interface IPs are pingable
SUCCESS: All admin eth0 interface IPs are pingable
SUCCESS: All private Infiniband interface IPs are pingable
INFO: All PDUs are pingable
SUCCESS: All InfiniBand switches are pingable
SUCCESS: Puppet master is running on host_1-master
SUCCESS: Puppet running on all cluster hosts
SUCCESS: Cloudera SCM server is running on host_3
SUCCESS: Cloudera SCM agent running on all cluster hosts
SUCCESS: Name Node is running on host_1
SUCCESS: Secondary Name Node is running on host_2
SUCCESS: Resource Manager is running on host_3
SUCCESS: Data Nodes running on all cluster hosts
SUCCESS: Node Managers running on all cluster slave hosts
INFO: Skipping Hadoop filesystem test because the hdfs user has no Kerberos ticket.
INFO: Use this command to get a Kerberos ticket for the hdfs user :
INFO: su hdfs -c “kinit hdfs@REALM.NAME”
SUCCESS: MySQL server is running on MySQL master node host_3
SUCCESS: MySQL server is running on MySQL backup node host_2
SUCCESS: Hive Server is running on Hive server node host_4
SUCCESS: Hive metastore server is running on Hive server node host_4
SUCCESS: Dnsmasq server running on all cluster hosts
INFO: Checking local DNS resolve of public hostnames on all cluster hosts
SUCCESS: All cluster hosts resolve public hostnames to private IPs
INFO: Checking local reverse DNS resolve of private IPs on all cluster hosts
SUCCESS: All cluster hosts resolve private IPs to public hostnames
SUCCESS: 2 virtual NICs available on all cluster hosts
SUCCESS: NTP service running on all cluster hosts
SUCCESS: At least one valid NTP server accessible from all cluster servers.
SUCCESS: Max clock drift of 0 seconds is within limits
SUCCESS: Big Data Appliance cluster health checks succeeded
[root@host_1 cloudera]#

b) Make sure services are healthy in CM.

c) Verify the output from the cluster verification checks is successful on Node 1 of the cluster:

mammoth -c

[root@host_1 cloudera]# mammoth -c
INFO: Logging all actions in /opt/oracle/BDAMammoth/bdaconfig/tmp/host_1-20180326093706.log and traces in /opt/oracle/BDAMammoth/bdaconfig/tmp/host_1-20180326093706.trc
INFO: This is the install of the primary rack
INFO: Creating nodelist files…
INFO: Checking if password-less ssh is set up
INFO: Executing checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Executing checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Checking passwordless ssh setup to host_1
host_1<–fdqn–>
INFO: Checking if password-less ssh is set up
INFO: Executing checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Executing checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Reading component versions from /opt/oracle/BDAMammoth/bdaconfig/COMPONENTS
INFO: Getting factory serial numbers
INFO: Executing getserials.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed getserials.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Generated bdaserials on all nodes
SUCCESS: Ran /usr/bin/scp host_1:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_1 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_2:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_2 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_3:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_3 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_4:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_4 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_5:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_5 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_6:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_6 and it returned: RC=0

INFO: Executing genTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Executed genTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Successfully set up Kerberos test users.
INFO: Executing copyKeytab.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed copyKeytab.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Successfully copied keytabs to Mammoth node.
INFO: Executing oracleUser.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed oracleUser.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Doing post-cleanup operations
INFO: Running cluster validation checks and generating install summary
Enter CM admin password to enable check for CM services and hosts
Press ENTER twice to skip CM services and hosts checks
Enter password:
Enter password again:
INFO: Password saved. Doing Cloudera Manager health checks, please wait
Which Cluster Type? Hadoop Cluster
Is Kerberos enabled? Yes
INFO: Running validation tests may take up to 30 minutes depending on the size of the cluster, please wait

HttpFS Test
—————————————————————————————-
SUCCESS: Running httpfs server test succeeded. HTTP/1.1 200 OK
INFO: Test finished in 8 seconds. Details in httpfs_test.out
SUCCESS: HttpFS Test succeeded

Hive Server 2 Test
—————————————————————————————-
INFO: HiveServer2 test – Query database/table info via Beeline
INFO: Test finished in 14 seconds. Details in hiveserver2_test.out
SUCCESS: Hive Server 2 Test succeeded

Spark Test
—————————————————————————————-
INFO: final status: SUCCEEDED
SUCCESS: Pi is roughly 3.1395511395511395
INFO: Test finished in 33 seconds. Details in spark_test.out
SUCCESS: Spark Test succeeded

Spark2 Test
—————————————————————————————-
SUCCESS: Pi is roughly 3.1424471424471423
INFO: Test finished in 33 seconds. Details in spark2_test.out
SUCCESS: Spark2 Test succeeded

Orabalancer Test
—————————————————————————————-
SUCCESS: Oracle Perfect Balance test passed
INFO: Test finished in 77 seconds. Details in balancer_test.out
SUCCESS: Orabalancer Test succeeded

WebHCat Test
—————————————————————————————-
SUCCESS: creating a hcatlog database succeeded. HTTP/1.1 200 OK
SUCCESS: creating a table succeeded. HTTP/1.1 200 OK
SUCCESS: creating a partition succeeded. HTTP/1.1 200 OK
SUCCESS: creating a colum succeeded. HTTP/1.1 200 OK
SUCCESS: creating a property succeeded. HTTP/1.1 200 OK
SUCCESS: describing hcat table succeeded. HTTP/1.1 200 OK
SUCCESS: deleting hcat table succeeded. HTTP/1.1 200 OK
SUCCESS: deleting hcat database succeeded. HTTP/1.1 200 OK
INFO: Test finished in 97 seconds. Details in webhcat_test.out
SUCCESS: WebHCat Test succeeded

Hive Metastore Test
—————————————————————————————-
INFO: Query Hive Metastore Table Passed on node host_1
INFO: Query Hive Metastore Table Passed on node host_2
INFO: Query Hive Metastore Table Passed on node host_3
INFO: Query Hive Metastore Table Passed on node host_4
INFO: Query Hive Metastore Table Passed on node host_5
INFO: Query Hive Metastore Table Passed on node host_6
INFO: Test finished in 107 seconds. Details in metastore_test.out
SUCCESS: Hive Metastore Test succeeded

Teragen-sort-validate Test
—————————————————————————————-
INFO: Test finished in 234 seconds. Details in terasort.out
SUCCESS: Teragen-sort-validate Test succeeded

Oozie Workflow Test
—————————————————————————————-
INFO: Map Reduce Job Status: OK job_1521738483661_0031 SUCCEEDED
INFO: Pig Job Status: OK job_1521738483661_0019 SUCCEEDED
INFO: Hive Job Status: OK job_1521738483661_0023 SUCCEEDED
INFO: Sqoop Job Status: OK job_1521738483661_0026 SUCCEEDED
INFO: Streaming Job Status: OK job_1521738483661_0028 SUCCEEDED
INFO: Test finished in 245 seconds. Details in ooziewf_test.out
SUCCESS: Oozie Workflow Test succeeded

BDA Cluster Check
—————————————————————————————-
host_5:
host_2:
host_6:
host_1:
host_4:
host_3:
INFO: All PDUs are pingable
SUCCESS: Big Data Appliance cluster health checks succeeded
INFO: Test finished in 212 seconds. Details in bdacheckcluster.out
SUCCESS: BDA Cluster Check succeeded
========================================================================================
TEST LOG STATUS TIME(s)
—————————————————————————————-
BDA_Cluster_Check bdacheckcluster.out SUCCESS 212
Teragen-sort-validate_Test terasort.out SUCCESS 234
Oozie_Workflow_Test ooziewf_test.out SUCCESS 245
Hive_Metastore_Test metastore_test.out SUCCESS 107
Hive_Server_2_Test hiveserver2_test.out SUCCESS 14
WebHCat_Test webhcat_test.out SUCCESS 97
HttpFS_Test httpfs_test.out SUCCESS 8
Orabalancer_Test balancer_test.out SUCCESS 77
Spark_Test spark_test.out SUCCESS 33
Spark2_Test spark2_test.out SUCCESS 33
—————————————————————————————-
Total time : 457 sec.
========================================================================================
INFO: Executing oracleUserDestroy.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed oracleUserDestroy.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Executing remTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Executed remTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Successfully removed Kerberos test users.
SUCCESS: Ran /bin/cp -pr /opt/oracle/BDAMammoth/bdaconfig/tmp/* /opt/oracle/bda/install/log/clusterchk/summary-20180326093820 and it returned: RC=0
SUCCESS: Ran /bin/rm -rf /opt/oracle/BDAMammoth/bdaconfig/tmp/* and it returned: RC=0
SUCCESS: Ran /bin/cp -prf /tmp/bdacheckcluster* /opt/oracle/bda/install/log/clusterchk/summary-20180326093820 and it returned: RC=0
INFO: Install summary copied to /opt/oracle/bda/install/log/clusterchk/summary-20180326093820
INFO: Time spent in post-cleanup operations was 602 seconds
========================================================================================
SUCCESS: Cluster validation checks were all successful
INFO: Please download the install summary zipfile from /tmp/<–clustername–>-install-summary.zip
========================================================================================
[root@host_1 cloudera]#

3. On Node 1 of the cluster find the existing https keystore password. Create an environment variable for it so it can be used in the steps below.
a) Get the existing https keystore password with: “bdacli getinfo cluster_https_keystore_password”.

Output looks like:

bdacli getinfo cluster_https_keystore_password
Enter the admin user for CM (press enter for admin): admin
Enter the admin password for CM:
3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki

 

b) Create an environment variable for use during the setup:
export PW=3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki
echo $PW
3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki

 

4. On Node 1 of the cluster, find the existing https truststore password and https truststore path. Create environment variables for each so they can be used in the steps below.

a) Get the existing https truststore password with: “bdacli getinfo cluster_https_truststore_password”.
Output looks like:

bdacli getinfo cluster_https_truststore_password
Enter the admin user for CM (press enter for admin):
Enter the admin password for CM:
FghaVNdTCkMatGgOhZITygmOzqY5IFqBKBhLUsY40IPpezLx89TmQF61CmcBKEoS

b) Create an environment variable for use during the setup:

export TPW=FghaVNdTCkMatGgOhZITygmOzqY5IFqBKBhLUsY40IPpezLx89TmQF61CmcBKEoS
echo $TPW
FghaVNdTCkMatGgOhZITygmOzqY5IFqBKBhLUsY40IPpezLx89TmQF61CmcBKEoS

c) Get the existing https truststore path with: “bdacli getinfo cluster_https_truststore_path”.

Output looks like below for a cluster name of “<–clustername–>”.

# bdacli getinfo cluster_https_truststore_path
Enter the admin user for CM (press enter for admin):
Enter the admin password for CM:
/opt/cloudera/security/jks/<–clustername–>.truststore

d) Create the corresponding environment variables.

Example based on a cluster name of “<–clustername–>”:
export TPATH=/opt/cloudera/security/jks/<–clustername–>.truststore
echo $TPATH
/opt/cloudera/security/jks/<–clustername–>.truststore

Steps to set up user-provided certificates for web consoles and Hadoop network encryption

Perform all steps as the 'root' user. On the BDA cluster, perform the steps on Node 1 unless specified otherwise; the only exceptions are the Hue service updates and starting/stopping the Cloudera Manager server.

1. Stop Cloudera Manager (CM) services.

a) Log into Cloudera Manager as the 'admin' user.

b) Stop the cluster services: Home > <cluster_name> dropdown > Stop

c) Stop the Cloudera Management Service: Home > mgmt dropdown > Stop

d) Stop the Cloudera Manager agents:

dcli -C service cloudera-scm-agent stop

Verify with:

dcli -C service cloudera-scm-agent status

e) Stop the Cloudera Manager server from Node 3:

service cloudera-scm-server stop

Verify with:

service cloudera-scm-server status

 

2. Create a new /opt/cloudera/security directory, backing up the existing one first (in case anything needs to be restored). Do this on all nodes of the cluster.

a) Back up the existing /opt/cloudera/security on all cluster nodes. /opt/cloudera/security is the base location for security-related files.

dcli -C "mv /opt/cloudera/security /opt/cloudera/security.BAK_`date +%d%b%Y%H%M%S`"

b) Create a new /opt/cloudera/security and related sub-directories on all cluster nodes. /opt/cloudera/security/jks is the location for the Java-based keystore and truststore files used by Cloudera Manager and Java-based cluster services, and /opt/cloudera/security/x509 is the location for the openssl key, cert and cacerts files used by the Cloudera Manager agent and Hue.

dcli -C mkdir -p /opt/cloudera/security/jks
dcli -C mkdir -p /opt/cloudera/security/x509

 

3. Create a staging directory on the first server and upload the ".pfx" and ".cer" files. In our case there is one ".pfx" file (containing the private keys, covering all hostnames of both the client and management networks) and two ".cer" files (one with the public certificate of the given .pfx file and the other with the root certificate).

[root@host_1 cloudera]# cd /root/staging/
[root@host_1 staging]# ls -lrt
-rw-r--r-- 1 root root 1500 Mar 20 17:42 <–root_cer–>.cer
-rw-r--r-- 1 root root 2928 Mar 20 18:19 <–certificate–>.cer
-rw-r--r-- 1 root root 3892 Mar 21 15:58 <–certificate–>.pfx
[root@host_1 staging]#

4. Since the provided .pfx file was delivered with a generic password and the alias "certreq-2df655dc-bb52-4442-9612-bc0622375d2f", before starting I have to find the alias and also change the password.

[root@host_1 ~]# keytool -list -keystore /root/staging/<–certificate–>.pfx
Enter keystore password:
***************** WARNING WARNING WARNING *****************
* The integrity of the information stored in your keystore *
* has NOT been verified! In order to verify its integrity, *
* you must provide your keystore password. *
***************** WARNING WARNING WARNING *****************
Keystore type: JKS
Keystore provider: SUN
Your keystore contains 1 entry
certreq-2df655dc-bb52-4442-9612-bc0622375d2f, Mar 20, 2018, PrivateKeyEntry,
[root@host_1 ~]#

a) To change the alias, run the command below; this command will also create /opt/cloudera/security/jks/node.jks.

— to import
[root@host_1 ~]# keytool -importkeystore -srckeystore /root/staging/<–certificate–>.pfx -srcstoretype pkcs12 -destkeystore /opt/cloudera/security/jks/node.jks -deststoretype JKS -srcalias certreq-2df655dc-bb52-4442-9612-bc0622375d2f -destalias <–alias that you want to set–>

 

— to check
keytool -keystore /opt/cloudera/security/jks/node.jks -list
[root@host_1 ~]# keytool -keystore /opt/cloudera/security/jks/node.jks -list
Enter keystore password:
***************** WARNING WARNING WARNING *****************
* The integrity of the information stored in your keystore *
* has NOT been verified! In order to verify its integrity, *
* you must provide your keystore password. *
***************** WARNING WARNING WARNING *****************
Keystore type: JKS
Keystore provider: SUN
Your keystore contains 1 entry
<–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
[root@host_1 ~]#

b) To change the password, run the command below:
–change the password for alias <–alias that you want to set–>
keytool -keypasswd -keystore /opt/cloudera/security/jks/node.jks -alias <–alias that you want to set–>

[root@host_1 ~]# keytool -keypasswd -keystore /opt/cloudera/security/jks/node.jks -alias <–alias that you want to set–>
Enter keystore password:
Enter key password for <–alias that you want to set–>
New key password for <–alias that you want to set–>:
Re-enter new key password for <–alias that you want to set–>:
[root@host_1 ~]#

5. Next, the root certificate has to be imported into the node.jks keystore that we created:

— import <–your companys root cer–> root certificate
keytool -keystore /opt/cloudera/security/jks/node.jks -alias <–alias of root cer–> -import -file /opt/cloudera/security/jks/<–your company root cer–>.cer -storepass $PW -keypass $PW -noprompt

 

[root@host_1 ~]# keytool -keystore /opt/cloudera/security/jks/node.jks -alias <–alias of root cer–> -import -file /opt/cloudera/security/jks/<–your company root cer–>.cer -storepass $PW -keypass $PW -noprompt
Certificate was added to keystore

 

–check
[root@host_1 ~]# keytool -keystore /opt/cloudera/security/jks/node.jks -list
Enter keystore password:
***************** WARNING WARNING WARNING *****************
* The integrity of the information stored in your keystore *
* has NOT been verified! In order to verify its integrity, *
* you must provide your keystore password. *
***************** WARNING WARNING WARNING *****************
Keystore type: JKS
Keystore provider: SUN
Your keystore contains 2 entries
<–alias of root cer–>, Mar 20, 2018, trustedCertEntry, —-> this is coming from root certificate
Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
<–alias that you want to set from previous entry–>, Mar 20, 2018, PrivateKeyEntry, —-> this is coming from pfx file
Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
[root@host_1 ~]#

6. After importing all required certificates, we have to align the rest of the cluster:

–copy to whole environment
dcli -C -f /opt/cloudera/security/jks/<–your company root cer–>.cer -d /opt/cloudera/security/jks/
dcli -C -f /opt/cloudera/security/jks/node.jks -d /opt/cloudera/security/jks/
dcli -C -f /root/staging/<–certificate–> -d /opt/cloudera/security/jks/

 

–check whole environment
[root@host_1 ~]# dcli -C keytool -keystore /opt/cloudera/security/jks/node.jks -list
…..
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
[root@host_1 ~]#

7. Next, we have to create the truststore as below:

–put root certificate to $TPATH (which is /opt/cloudera/security/jks/<–clustername–>.truststore)

[root@host_1 ~]# keytool -keystore $TPATH -alias <–alias of root cer–> -import -file /opt/cloudera/security/jks/IS4F_ROOT_CA_B64.cer
Enter keystore password:
Re-enter new password:
Owner: CN=<–your company root cer name–>, O=<–your company root cer name–>, C=BE
Issuer: CN=<–your company root cer name–>, O=I<–your company root cer name–>, C=BE
Serial number: 40000000001464238d3e9
Valid from: Wed May 28 12:00:00 CEST 2014 until: Sat May 28 12:00:00 CEST 2039
Certificate fingerprints:
MD5: 84:2C:DD:A9:D4:1A:C2:25:79:60:C7:23:24:44:06:43
SHA1: 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
SHA256: AC:6C:06:2F:05:F3:0E:66:1D:58:6F:1D:D4:14:B5:A8:D8:47:8D:A5:DA:B6:C9:AB:88:91:30:92:0B:82:4B:73
Signature algorithm name: SHA1withRSA
Version: 3
Extensions:
#1: ObjectId: 2.5.29.19 Criticality=true
BasicConstraints:[
CA:true
PathLen:1
]
#2: ObjectId: 2.5.29.32 Criticality=false
CertificatePolicies [
[CertificatePolicyId: [1.3.6.1.4.1.8162.1.3.2.10.1.0]
[PolicyQualifierInfo: [
qualifierID: 1.3.6.1.5.5.7.2.2
qualifier: 0000: 30 36 1A 34 68 74 74 70 73 3A 2F 2F 72 6F 6F 74 06.4https://root
0010: 2E 49 53 34 46 70 6B 69 73 65 72 76 69 63 65 73 .IS4Fpkiservices
0020: 2E 63 6F 6D 2F 49 53 34 46 5F 52 6F 6F 74 43 41 .com/<–alias of root cer–>
0030: 5F 43 50 53 2E 70 64 66 _CPS.pdf
]] ]
]
#3: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
Key_CertSign
Crl_Sign
]
#4: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 4B 09 C5 83 63 B9 3D 54 5C 1B 60 A2 28 F9 1A 6D K…c.=T\.`.(..m
0010: 0F F8 1F 1C ….
]
]
Trust this certificate?host_1 [no]: yes
Certificate was added to keystore

[root@ ~]#

–copy truststore to whole environment
[root@host_1 ~]# dcli -C -f $TPATH -d $TPATH

–check
[root@host_1 ~]# dcli -C ls -lrt $TPATH
192.168.11.16: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.17: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.18: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.19: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.20: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.21: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
[root@host_1 ~]#

8. Copy the CA public certificate file to an agents.pem file to be used by Cloudera Manager agents and Hue on each node of the cluster.

-copy certificate as agent.pem

dcli -C -f /opt/cloudera/security/jks/<–alias of root cer–>.cer -d /opt/cloudera/security/x509/agents.pem
[root@host_1 ~]# dcli -C -f /opt/cloudera/security/jks/<–alias of root cer–>.cer -d /opt/cloudera/security/x509/agents.pem

–check

[root@host_1 ~]# dcli -C ls -lrt /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
[root@host_1 ~]#

9. Set up the CA public certificate, ca.crt, for the Hue service. All commands in this step are run from the Hue server node.
a) ssh to the Hue node.
b) Verify /opt/cloudera/security/jks/<–your company root cer name–>.cer:
–for HUE go to HUE host

[root@hue_host ~]# ls -l /opt/cloudera/security/jks/<–your company root cer name–>.cer
-rw-r--r-- 1 root root 1500 Mar 20 17:53 /opt/cloudera/security/jks/<–your company root cer name–>.cer

c) Create the link to /opt/cloudera/security/x509/hue.pem.
–create link for hue
[root@hue_host ~]# ln /opt/cloudera/security/jks/<–your company root cer name–>.cer /opt/cloudera/security/x509/hue.pem

–check
[root@hue_host ~]# ls -l /opt/cloudera/security/x509/hue.pem
-rw-r--r-- 2 root root 1500 Mar 20 17:53 /opt/cloudera/security/x509/hue.pem
[root@hue_host ~]#
d) Export the existing https keystore password collected in the “Prerequisite” Section on the Hue server node. For example:

# export PW=3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki  

e) Run the keytool commands to import the key and create required files for HUE:
–import the key <–alias that you want to set–> for HUE

/usr/java/latest/bin/keytool -importkeystore -srckeystore /opt/cloudera/security/jks/node.jks -srcstorepass $PW -srckeypass $PW -destkeystore /tmp/hue_host-keystore.p12 -deststoretype PKCS12 -srcalias <–alias that you want to set–> -deststorepass $PW -destkeypass $PW -noprompt

–Run the “openssl pkcs12” command:

openssl pkcs12 -in /tmp/${HOSTNAME}-keystore.p12 -passin pass:${PW} -nocerts -out /opt/cloudera/security/x509/node.key -passout pass:${PW}
[root@hue_host ~]# openssl pkcs12 -in /tmp/hue_host-keystore.p12 -passin pass:${PW} -nocerts -out /opt/cloudera/security/x509/node.key -passout pass:${PW}
MAC verified OK
[root@hue_host ~]#

–Run the “openssl rsa” command

openssl rsa -in /opt/cloudera/security/x509/node.key -passin pass:${PW} -out /opt/cloudera/security/x509/node.hue.key
[root@hue_host ~]# openssl rsa -in /opt/cloudera/security/x509/node.key -passin pass:${PW} -out /opt/cloudera/security/x509/node.hue.key
writing RSA key
[root@hue_host ~]#

–Run the “openssl pkcs12” command like:

openssl pkcs12 -in /tmp/${HOSTNAME}-keystore.p12 -passin pass:${PW} -nokeys -out /opt/cloudera/security/x509/node.cert
[root@hue_host ~]# openssl pkcs12 -in /tmp/${HOSTNAME}-keystore.p12 -passin pass:${PW} -nokeys -out /opt/cloudera/security/x509/node.cert
MAC verified OK
[root@hue_host~]#

f) Change the owner on the files to be “hue”

 

[root@hue_host ~]# cd /opt/cloudera/security/x509/
[root@hue_host x509]# ls -l
total 20
-rw-r--r-- 1 root root 1500 Mar 20 17:58 agents.pem
-rw-r--r-- 2 root root 1500 Mar 20 17:53 hue.pem
-rw-r--r-- 1 root root 3122 Mar 20 18:08 node.cert
-rw-r--r-- 1 root root 1675 Mar 20 18:07 node.hue.key
-rw-r--r-- 1 root root 1977 Mar 20 18:06 node.key

[root@hue_host x509]#
chown hue /opt/cloudera/security/x509/node.key
chown hue /opt/cloudera/security/x509/node.cert
chown hue /opt/cloudera/security/x509/node.hue.key

[root@hue_host x509]# ls -l
total 20
-rw-r--r-- 1 root root 1500 Mar 20 17:58 agents.pem
-rw-r--r-- 2 root root 1500 Mar 20 17:53 hue.pem
-rw-r--r-- 1 hue root 3122 Mar 20 18:08 node.cert
-rw-r--r-- 1 hue root 1675 Mar 20 18:07 node.hue.key
-rw-r--r-- 1 hue root 1977 Mar 20 18:06 node.key
[root@hue_host x509]#

10. Start everything and check the Cloudera Manager server logs on the Cloudera Manager host (the 3rd server in all environments) to see whether there are any SSL errors.

a) Start the Cloudera Manager server from Node 3:

# service cloudera-scm-server start

Verify:

# service cloudera-scm-server status

b) From Node 1, start the agents:

# dcli -C service cloudera-scm-agent start

Verify:

# dcli -C service cloudera-scm-agent status

c) Log into CM as the "admin" user. Note this is like a fresh first-time login.
d) Start the mgmt service: Home > mgmt dropdown > Start
e) Start the cluster: Home > <cluster-name> dropdown > Start

 

11. Make sure the cluster is healthy.

a) Verify with:

# bdacheckcluster

b) Make sure services are healthy in CM.
c) Verify the output from the cluster verification checks is successful:

# cd /opt/oracle/BDAMammoth

# ./mammoth -c

 

IF YOU FACE ANY KIND OF PROBLEM, JUST ROLL BACK WHAT YOU HAVE DONE BY REPLACING YOUR DIRECTORY WITH THE ONE THAT YOU BACKED UP AT THE BEGINNING.

Deploying a Custom Patch Parcel Using Cloudera Manager

Following are the steps to install and deploy a patched parcel received from Cloudera Support.

Download the Patched Parcel
1. Download the .parcel and the associated manifest.json

  • Both files are necessary
  • Do not edit the manifest.json

2. Create a local server to host the .parcel and manifest.json so that Cloudera Manager sees and ingests them to its own Parcel Repository.

3. The Cloudera Documentation details a Local Parcel Repository:

Method 1: Creating a Temporary Repository:

Follow the Instructions under Cloudera Documentation: Creating a Temporary Remote Repository
Follow the Instructions under Cloudera Documentation: Configuring the Cloudera Manager Server to Use the Parcel URL

To create a disposable local repository to deploy the parcel once: It is convenient to perform this on the same host that runs Cloudera Manager, or a Gateway role. In this example, python SimpleHTTPServer is used from a specific directory (select as desired).

  1. Download the patched .parcel and manifest.json as provided in a secure link from Cloudera Support
  2. Copy the .parcel and manifest.json to a desired location on the server.

Example:
“/tmp/parcel”

This is the directory from which the Python SimpleHTTPServer serves out files:

$ mkdir /tmp/parcel
$ cp /home/user/Downloads/patchparcel/CDH-4.5.0.p234.parcel /tmp/parcel/
$ cp /home/user/Downloads/patchparcel/manifest.json /tmp/parcel/
  3. Determine a port that the system is not listening on (for example, port 8900). This is passed into the SimpleHTTPServer command below.
  4. Start a Python SimpleHTTPServer to serve these two files from the newly created directory:
$ cd /tmp/parcel
$ python -m SimpleHTTPServer 8900

Serving HTTP on 0.0.0.0 port 8900 ...
  5. Confirm this hosted parcel directory is reachable by going to http://<server>:<port> using a browser. The links for the .parcel and the manifest.json should be displayed. A quick command-line check is sketched below.
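The same check can also be done from the command line on the Cloudera Manager host (a sketch using the port and file names from the example above):

$ curl -s http://localhost:8900/manifest.json | head
$ curl -sI http://localhost:8900/CDH-4.5.0.p234.parcel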

Configuring Cloudera Manager to use the Repository:

Add the server as a Remote Parcel Repository in Cloudera Manager:
  1. In the Cloudera Manager Admin Console, go to Administration > Settings > Parcels
  2. Under Remote Parcel Repository URLs, click the + (plus) button to add a new URL
  3. Enter http://<server>:<port> in the new location
  4. Click Save Changes

Download, Distribute and/or Activate the Patch Parcel:

  1. Use the published instructions in Using Parcels to manage the parcel in Cloudera Manager.
  2. Click Check for New Parcels to cause Cloudera Manager to find this patch parcel and note that it is available in the temporary SimpleHTTPServer repository.

Method 2: Using /opt/cloudera/parcel-repo directory on the Cloudera Manager Server

1. Copy the .parcel file into the /opt/cloudera/parcel-repo directory on the Cloudera Manager Server

2. $ sha1sum /opt/cloudera/parcel-repo/CDH-patch-file.parcel | cut -d ' ' -f 1 > /opt/cloudera/parcel-repo/CDH-patch-file.parcel.sha

3. $ chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/CDH-patch-file.parcel /opt/cloudera/parcel-repo/CDH-patch-file.parcel.sha

4. In Cloudera Manager, check for the patch parcel to appear

Note: replace CDH-patch-file.parcel with the actual filename of the patched .parcel file

 

Steps to Configure a Single-Node YARN Cluster

The following type of installation is often referred to as “pseudo-distributed” because it mimics some of the functionality of a distributed Hadoop cluster. A single machine is, of course, not practical for any production use, nor is it parallel. A small-scale Hadoop installation can provide a simple method for learning Hadoop basics, however.

The recommended minimal installation hardware is a dual-core processor with 2 GB of RAM and 2 GB of available hard drive space. The system will need a recent Linux distribution with Java installed (e.g., Red Hat Enterprise Linux or rebuilds, Fedora, Suse Linux Enterprise, OpenSuse, Ubuntu). Red Hat Enterprise Linux 6.3 is used for this installation example. A bash shell environment is also assumed. The first step is to download Apache Hadoop.

Step 1: Download Apache Hadoop

Download the latest distribution from the Hadoop website (http://hadoop.apache.org/). For example, as root do the following:

# cd /root
# wget http://mirrors.ibiblio.org/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

Next create and extract the package in /opt/yarn:

# mkdir -p /opt/yarn

# cd /opt/yarn
# tar xvzf /root/hadoop-2.2.0.tar.gz

Step 2: Set JAVA_HOME

For Hadoop 2, the recommended version of Java can be found at http://wiki.apache.org/hadoop/HadoopJavaVersions. In general, a Java Development Kit 1.6 (or greater) should work. For this install, we will use Open Java 1.6.0_24, which is part of Red Hat Enterprise Linux 6.3. Make sure you have a working Java JDK installed; in this case, it is the java-1.6.0-openjdk RPM. To include JAVA_HOME for all bash users (other shells must be set in a similar fashion), make an entry in /etc/profile.d as follows:

# echo "export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/" > /etc/profile.d/java.sh

To make sure JAVA_HOME is defined for this session, source the new script:

# source /etc/profile.d/java.sh
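A quick check that the variable and the JDK are usable in the current shell:

# echo $JAVA_HOME
# $JAVA_HOME/bin/java -version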

Step 3: Create Users and Groups

It is best to run the various daemons with separate accounts. Three accounts (yarn, hdfs, mapred) in the group hadoop can be created as follows:

# groupadd hadoop

# useradd -g hadoop yarn

# useradd -g hadoop hdfs

# useradd -g hadoop mapred

Step 4: Make Data and Log Directories

Hadoop needs various data and log directories with various permissions. Enter the following lines to create these directories:

# mkdir -p /var/data/hadoop/hdfs/nn

# mkdir -p /var/data/hadoop/hdfs/snn

# mkdir -p /var/data/hadoop/hdfs/dn

# chown hdfs:hadoop /var/data/hadoop/hdfs -R

# mkdir -p /var/log/hadoop/yarn

# chown yarn:hadoop /var/log/hadoop/yarn -R

Next, move to the YARN installation root and create the log directory and set the owner and group as follows:

# cd /opt/yarn/hadoop-2.2.0

# mkdir logs
# chmod g+w logs

# chown yarn:hadoop . -R

Step 5: Configure core-site.xml

From the base of the Hadoop installation path (e.g., /opt/yarn/hadoop-2.2.0), edit the etc/hadoop/core-site.xml file. The original installed file will have no entries other than the <configuration></configuration> tags. Two properties need to be set. The first is the fs.default.name property, which sets the host and request port name for the NameNode (metadata server for HDFS). The second is hadoop.http.staticuser.user, which will set the default user name to hdfs. Copy the following lines to the Hadoop etc/hadoop/core-site.xml file and remove the original empty <configuration> </configuration> tags.

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>hdfs</value>
  </property>
</configuration>

Step 6: Configure hdfs-site.xml

From the base of the Hadoop installation path, edit the etc/hadoop/hdfs-site.xml file. In the single-node pseudo-distributed mode, we don’t need or want the HDFS to replicate file blocks. By default, HDFS keeps three copies of each file in the file system for redundancy. There is no need for replication on a single machine; thus the value of dfs.replication will be set to 1.

In hdfs-site.xml, we specify the NameNode, Secondary NameNode, and DataNode data directories that we created in Step 4. These are the directories used by the various components of HDFS to store data. Copy the following lines into Hadoop etc/hadoop/hdfs-site.xml and remove the original empty <configuration> </configuration> tags.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/var/data/hadoop/hdfs/nn</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>file:/var/data/hadoop/hdfs/snn</value>
  </property>
  <property>
    <name>fs.checkpoint.edits.dir</name>
    <value>file:/var/data/hadoop/hdfs/snn</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/var/data/hadoop/hdfs/dn</value>
  </property>
</configuration>

 

Step 7: Configure mapred-site.xml

From the base of the Hadoop installation, edit the etc/hadoop/mapred-site.xml file. A new configuration option for Hadoop 2 is the capability to specify a framework name for MapReduce, setting the mapreduce.framework.name property. In this install, we will use the value of "yarn" to tell MapReduce that it will run as a YARN application. First, copy the template file to mapred-site.xml.

# cp mapred-site.xml.template mapred-site.xml

Next, copy the following lines into the Hadoop etc/hadoop/mapred-site.xml file and remove the original empty <configuration> </configuration> tags.

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

 

Step 8: Configure yarn-site.xml

From the base of the Hadoop installation, edit the etc/hadoop/yarn-site.xml file. The yarn.nodemanager.aux-services property tells NodeManagers that there will be an auxiliary service called mapreduce.shuffle that they need to implement. After we tell the NodeManagers to implement that service, we give it a class name as the means to implement that service. This particular configuration tells MapReduce how to do its shuffle. Because NodeManagers won’t shuffle data for a non-MapReduce job by default, we need to configure such a service for MapReduce. Copy the following lines to the Hadoop etc/hadoop/yarn-site.xml file and remove the original empty <configuration> </configuration> tags.

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

Step 9: Modify Java Heap Sizes

The Hadoop installation uses several environment variables that determine the heap sizes for each Hadoop process. These are defined in the etc/hadoop/*-env.sh files used by Hadoop. The default for most of the processes is a 1 GB heap size; because we’re running on a workstation that will probably have limited resources compared to a standard server, however, we need to adjust the heap size settings. The values that follow are adequate for a small workstation or server.

Edit the etc/hadoop/hadoop-env.sh file to reflect the following (don’t forget to remove the “#” at the beginning of the line):

HADOOP_HEAPSIZE="500"

HADOOP_NAMENODE_INIT_HEAPSIZE="500"

Next, edit mapred-env.sh to reflect the following:

HADOOP_JOB_HISTORYSERVER_HEAPSIZE=250

Finally, edit yarn-env.sh to reflect the following:

JAVA_HEAP_MAX=-Xmx500m

The following line will need to be added to yarn-env.sh:

YARN_HEAPSIZE=500

Step 10: Format HDFS

For the HDFS NameNode to start, it needs to initialize the directory where it will hold its data. The NameNode service tracks all the metadata for the file system. The format process will use the value assigned to dfs.namenode.name.dir in etc/hadoop/hdfs-site.xml earlier (i.e., /var/data/hadoop/hdfs/nn). Formatting destroys everything in the directory and sets up a new file system. Format the NameNode directory as the HDFS superuser, which is typically the "hdfs" user account.

From the base of the Hadoop distribution, change directories to the "bin" directory and execute the following commands:

# su - hdfs

$ cd /opt/yarn/hadoop-2.2.0/bin

$ ./hdfs namenode -format

If the command worked, you should see the following near the end of a long list of messages:

INFO common.Storage: Storage directory /var/data/hadoop/hdfs/nn has been successfully formatted.

 

Step 11: Start the HDFS Services

Once formatting is successful, the HDFS services must be started. There is one service for the NameNode (metadata server), a single DataNode (where the actual data is stored), and the SecondaryNameNode (checkpoint data for the NameNode). The Hadoop distribution includes scripts that set up these commands as well as name other values such as PID directories, log directories, and other standard process configurations. From the bin directory in Step 10, execute the following as user hdfs:

$ cd ../sbin

$ ./hadoop-daemon.sh start namenode

The command should show the following:

starting namenode, logging to /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-namenode-limulus.out

The secondarynamenode and datanode services can be started in the same way:

$ ./hadoop-daemon.sh start secondarynamenode
starting secondarynamenode, logging to /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-secondarynamenode-limulus.out

$ ./hadoop-daemon.sh start datanode
starting datanode, logging to /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-datanode-limulus.out

If the daemon started successfully, you should see responses that will point to the log file. (Note that the actual log file is appended with “.log,” not “.out.”). As a sanity check, issue a jps command to confirm that all the services are running. The actual PID (Java Process ID) values will be different than shown in this listing:

$ jps

15140 SecondaryNameNode

15015 NameNode

15335 Jps

15214 DataNode

If the process did not start, it may be helpful to inspect the log files. For instance, examine the log file for the NameNode. (Note that the path is taken from the preceding command.)

vi /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-namenode-limulus.log

All Hadoop services can be stopped using the hadoop-daemon.sh script. For example, to stop the datanode service, enter the following (as user hdfs in the /opt/yarn/hadoop-2.2.0/sbin directory):

$ ./hadoop-daemon.sh stop datanode

The same can be done for the NameNode and SecondaryNameNode.
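As an optional smoke test (a sketch, still as the hdfs user), write and list a small file to confirm the file system is usable:

$ cd /opt/yarn/hadoop-2.2.0/bin
$ ./hdfs dfs -mkdir -p /user/hdfs
$ ./hdfs dfs -put /etc/hosts /user/hdfs/
$ ./hdfs dfs -ls /user/hdfs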

Step 12: Start YARN Services

As with HDFS services, the YARN services need to be started. One ResourceManager and one NodeManager must be started as user yarn (exiting from user hdfs first):

$ exit

logout

# su - yarn

$ cd /opt/yarn/hadoop-2.2.0/sbin

$ ./yarn-daemon.sh start resourcemanager

starting resourcemanager, logging to /opt/yarn/hadoop-2.2.0/logs/yarn-yarn-resourcemanager-limulus.out

$ ./yarn-daemon.sh start nodemanager
starting nodemanager, logging to /opt/yarn/hadoop-2.2.0/logs/yarn-yarn-nodemanager-limulus.out

As when the HDFS daemons were started in Step 11, the status of the running daemons is sent to their respective log files. To check whether the services are running, issue a jps command. The following shows all the services necessary to run YARN on a single server:

$ jps

15933 Jps

15567 ResourceManager

15785 NodeManager

If there are missing services, check the log file for the specific service. Similar to the case with HDFS services, the services can be stopped by issuing a stop argument to the daemon script:

./yarn-daemon.sh stop nodemanager
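Before moving on to the web interfaces, an end-to-end sanity check can be run by submitting the bundled Pi example as a YARN application (a sketch; run it as a user that has a home directory in HDFS, for example the hdfs user, and note the jar path is the one shipped in the 2.2.0 tarball):

$ cd /opt/yarn/hadoop-2.2.0
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 4 1000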

Step 13: Verify the Running Services Using the Web Interface

Both HDFS and the YARN ResourceManager have a web interface. These interfaces are a convenient way to browse many of the aspects of your Hadoop installation. To monitor HDFS, enter the following (or use your favorite web browser):

$ firefox http://localhost:50070

Connecting to port 50070 will bring up a web interface similar to Figure 1.1. The web interface for the ResourceManager can be viewed by entering the following:

$ firefox http://localhost:8088

A webpage similar to that shown in Figure 1.2 will be displayed.


Figure 1.1 Webpage for HDFS file system

 


Figure 1.2 Webpage for YARN ResourceManager