
Archive for the ‘big data’ Category

Disable all SSL certificates and go back to the initial state

April 23, 2018

Disable all SSL certificates and go back to the initial state.

1. All steps are done as ‘root’ user.

2. If you have passwordless ssh set up on all nodes, you can run dcli on any node; otherwise run the dcli commands on Node 1. (A quick check is shown after this list.)

3. When you get to the point of restarting the CM server, do that on the node with the CM role (Node 3 by default).

4. Make sure to run the regenerate script on Node 3.
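A quick way to confirm that dcli and passwordless ssh are working across the cluster before starting (not part of the original notes, just a sanity check):

# dcli -C hostname
# dcli -C "date"

Every node should answer with its hostname and a consistent time; if any node prompts for a password or times out, fix the ssh keys before proceeding.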

 

1. On Node 1, back up the existing security directory:

# dcli -C "cp -r -p /opt/cloudera/security /opt/cloudera/security.BAK_`date +%d%b%Y%H%M%S`"

 

2. Verify that the backup exists:
# dcli -C ls -ltrd /opt/cloudera/security*

 

3. Execute the script to renew the default certificates:

*********Perform all steps as ‘root’ user on Node 3*****************

a) Download and copy the regenerate.sh script to the node with the Cloudera Manager role (Node 3 by default).

You can download it to any directory. For example /tmp.

 

b) Give execute permissions to the script.

# chmod a+x /tmp/regenerate.sh

#########################################################################################################################
# This script should not be used to renew a user's self-signed certificates. It renews only the BDA default certificates.              #
#########################################################################################################################

#!/usr/bin/bash -x
# Renews the BDA default (self-signed) certificates on every node of the cluster.
export CMUSR="admin"
if [[ -z $CMPWD ]]; then
export CMPWD="$1"
if [[ -z $CMPWD ]]; then
echo "INFO: Since no CM password was given nothing can be done"
exit 0
fi
fi
# Read the Mammoth-installed keystore/truststore paths and passwords.
key_loc=`bdacli getinfo cluster_https_keystore_path`
key_password=`bdacli getinfo cluster_https_keystore_password`
trust_password=`bdacli getinfo cluster_https_truststore_password`
trust_loc=`bdacli getinfo cluster_https_truststore_path`
# The Mammoth node (Node 1) and the list of cluster node names from the install state.
firstnode=(`json-select --jpx="MAMMOTH_NODE" /opt/oracle/bda/install/state/config.json`)
nodenames=(`json-select --jpx="RACKS/NODE_NAMES" /opt/oracle/bda/install/state/config.json`)
for node in "${nodenames[@]}"
do
# On each node: extract the private key, create a new self-signed certificate (valid 7300 days),
# re-import it under the hostname alias, and rebuild the x509 files used by the CM agent and Hue.
ssh $node "keytool -importkeystore -srckeystore $key_loc -destkeystore /tmp/nodetmp.p12 -deststoretype PKCS12 -srcalias \$HOSTNAME -srcstorepass $key_password -srckeypass $key_password -destkeypass $key_password -deststorepass $key_password"
ssh $node "openssl pkcs12 -in /tmp/nodetmp.p12 -nodes -nocerts -out privateKey.pem -passin pass:$key_password -passout pass:$key_password"
ssh $node 'openssl req -x509 -new -nodes -key privateKey.pem -sha256 -days 7300 -out newCert.pem -subj "/C=/ST=/L=/O=/CN=${HOSTNAME}"'
ssh $node "keytool -import -keystore $key_loc -file newCert.pem -alias \$HOSTNAME -storepass $key_password -keypass $key_password"
ssh $node "/usr/java/latest/bin/keytool -exportcert -keystore $key_loc -alias \$HOSTNAME -storepass $key_password -file /opt/cloudera/security/jks/node.cert"
ssh $node "scp /opt/cloudera/security/jks/node.cert root@${firstnode}:/opt/cloudera/security/jks/node_\${HOSTNAME}.cert"
ssh $node "rm -f /tmp/nodetmp.p12; rm -f privateKey.pem; rm -f newCert.pem; rm -f /opt/cloudera/security/x509/node.key; rm -f /opt/cloudera/security/x509/node.cert; rm -f /opt/cloudera/security/x509/node_*pem"
ssh $node "/usr/java/latest/bin/keytool -importkeystore -srckeystore $key_loc -srcstorepass $key_password -srckeypass $key_password -destkeystore /tmp/\${HOSTNAME}-keystore.p12 -deststoretype PKCS12 -srcalias \$HOSTNAME -deststorepass $key_password -destkeypass $key_password -noprompt"
ssh $node "openssl pkcs12 -in /tmp/\${HOSTNAME}-keystore.p12 -passin pass:${key_password} -nokeys -out /opt/cloudera/security/x509/node.cert"
ssh $node "openssl pkcs12 -in /tmp/\${HOSTNAME}-keystore.p12 -passin pass:${key_password} -nocerts -out /opt/cloudera/security/x509/node.key -passout pass:${key_password}"
ssh $node "openssl rsa -in /opt/cloudera/security/x509/node.key -passin pass:${key_password} -out /opt/cloudera/security/x509/node.hue.key"
ssh $node "chown hue /opt/cloudera/security/x509/node.key"
ssh $node "chown hue /opt/cloudera/security/x509/node.cert"
ssh $node "chown hue /opt/cloudera/security/x509/node.hue.key"
done

# Rebuild the cluster truststore and the Hue truststore on Node 1 and distribute them to all nodes.
create=`ls /opt/cloudera/security/jks/ | grep "create"`
ssh $firstnode "rm -f $trust_loc"
ssh $firstnode " /opt/cloudera/security/jks/./${create} $trust_password"
ssh $firstnode " /opt/cloudera/security/x509/./create_hue.truststore.pl $trust_password"
ssh $firstnode "dcli -C -f $trust_loc -d $trust_loc"
ssh $firstnode "dcli -C -f /opt/cloudera/security/x509/hue.pem -d /opt/cloudera/security/x509/hue.pem"
# Export the certificate of the CM node (where this script runs) and distribute agents.pem to all nodes.
rm -f /opt/cloudera/security/jks/cm_key.der
rm -f /opt/cloudera/security/x509/agents.pem
/usr/java/latest/bin/keytool -exportcert -keystore $key_loc -alias $HOSTNAME -storepass $key_password -file /opt/cloudera/security/jks/cm_key.der
openssl x509 -out /opt/cloudera/security/x509/agents.pem -in /opt/cloudera/security/jks/cm_key.der -inform der
scp /opt/cloudera/security/x509/agents.pem root@${firstnode}:/opt/cloudera/security/x509/agents.pem
ssh $firstnode dcli -C -f /opt/cloudera/security/x509/agents.pem -d /opt/cloudera/security/x509/agents.pem

c) Run the script, providing the Cloudera Manager admin password as an argument:

# ./regenerate.sh <cm_password>

d) Upload the output to the SR for review.

 

4. Once the script has completed, restart the Cloudera Manager server and agents.

a) Stop Cloudera Manager Agents.

# dcli -C service cloudera-scm-agent stop

b) Restart Cloudera Manager server (On Node 3)

# service cloudera-scm-server restart

c) Verify with:
# service cloudera-scm-server status

d) Start Cloudera Manager Agents.
# dcli -C service cloudera-scm-agent start

e) Verify with:
# dcli -C service cloudera-scm-agent status

 

5. Make sure there are no SSL warnings in the Cloudera Manager server logs.

/var/log/cloudera-scm-server/cloudera-scm-server.log

You can also do:
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

and then also upload the
/var/log/cloudera-scm-server/cloudera-scm-server.log to the SR for review.
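A simple way to narrow the log down to SSL-related lines (a hedged shortcut, not one of the original steps):

# grep -i ssl /var/log/cloudera-scm-server/cloudera-scm-server.log | tail -n 50

Any handshake or certificate errors will show up here shortly after the restart.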

 

6. In CM:
a) Restart the Cloudera Management Service and wait until it is healthy.
b) Restart the cluster services.

 

7. Certificate validity can be checked using keytool or openssl commands.

a) with keytool
# keytool -printcert -file /opt/cloudera/security/x509/agents.pem

b) with openssl:
echo | openssl s_client -connect <fqdn.of.cloudera.manager.webui>:7183 2>/dev/null | openssl x509 -noout -subject -dates


Enable SSL over cluster via SAN (subject alternative name)

April 23, 2018

Step-by-step guide

On BDA V4.5 and higher, using certificates signed by a user's Certificate Authority for web consoles and for Hadoop network encryption is supported. This includes using a client's own certificates, signed by the client's Certificate Authority, instead of the default self-signed certificates generated on the BDA.

At a high level, users either update the Mammoth-installed truststore with the CA public certificate and the keystores with keys/certificates signed by the customer CA, or create a new keystore and truststore on all nodes of the BDA and point Cloudera Manager to that new location.

User provided signed certificates are not allowed for puppet. The use of puppet is internal and puppet is not intended for direct user usage.

The recommendation when using customer provided signed certificates with web consoles, etc. is to use the keystores/truststores provided by Mammoth and to make the minimal changes possible.

The Mammoth installed values can be viewed with:

  • bdacli getinfo cluster_https_keystore_password – Display the password of the keystore used by CM.
  • bdacli getinfo cluster_https_keystore_path – Display the path of the keystore used by CM.
  • bdacli getinfo cluster_https_truststore_password – Display the password of the truststore used by CM.
  • bdacli getinfo cluster_https_truststore_path – Display the path of the truststore used by CM.

Note: This document and all examples use the existing passwords provided in the Mammoth commands above.  There are additional changes required if the passwords are changed.

The steps presented here can be performed in a cluster where Kerberos is or is not installed.

This document requires:

1) that a user-provided CA public certificate is available for use on the BDA, and

2) that the user will use the BDA node-specific Certificate Signing Requests (CSRs) to create BDA node-specific signed certificates and copy them to the BDA as specified in the document (a sketch of generating such a CSR follows).
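As a rough sketch of item 2, a per-node CSR can be generated from the existing Mammoth keystore with keytool. This is not from the original document; it assumes the per-node key alias is the hostname (the convention the BDA default certificates use) and that the keystore path and password are the values returned by the bdacli commands above:

# keytool -certreq -keystore <cluster_https_keystore_path> -alias $HOSTNAME -storepass <cluster_https_keystore_password> -file /tmp/${HOSTNAME}.csr

The resulting /tmp/<hostname>.csr is what the CA signs; the signed certificate is then copied back to the BDA as described in the steps below.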

Prerequisites for setting up user-provided certificates for web consoles and Hadoop network encryption

On the BDA Cluster

  1. Identify the server running the Hue service.

In Cloudera Manager (CM) navigate: hue > Instances

Keep track of the server.

  2. Make sure the cluster is healthy:
a) Verify with:

bdacheckcluster

[root@host_1 cloudera]# bdacheckcluster
INFO: Logging results to /tmp/bdacheckcluster_1522049448/
Enter CM admin user to run dumpcluster
Enter username (admin):
Enter CM admin password to enable check for CM services and hosts
Press ENTER twice to skip CM services and hosts checks
Enter password:
Enter password again:
SUCCESS: Mammoth configuration file is valid.
SUCCESS: hdfs is in good health
SUCCESS: zookeeper is in good health
SUCCESS: yarn is in good health
SUCCESS: oozie is in good health
SUCCESS: hive is in good health
SUCCESS: hue is in good health
SUCCESS: yarn is in good health
SUCCESS: yarn is in good health
SUCCESS: sentry is in good health
SUCCESS: flume is in good health
SUCCESS: client is in good health
SUCCESS: Cluster passed checks on all hadoop services health check
SUCCESS: c39df580-32e2-4671-b2a4-5e47574aba5b is in good health
SUCCESS: 8b04b32a-d763-4817-a8e2-832ba024d52d is in good health
SUCCESS: dc7db617-3b2a-4517-a7a0-775df21c8be1 is in good health
SUCCESS: 4f4784b7-056d-4834-bcd2-a5b050e51a00 is in good health
SUCCESS: e406d9d0-951e-499a-90dc-a97ede1db51e is in good health
SUCCESS: 43e1382a-945a-49cd-8f56-a1670f09a6ca is in good health
SUCCESS: Cluster passed checks on all hosts health check
SUCCESS: All cluster host names are pingable
INFO: Starting cluster host hardware checks
SUCCESS: All cluster hosts pass hardware checks
INFO: Starting cluster host software checks
host_5:
host_6:
host_2:
host_1:
host_4:
host_3:
SUCCESS: All cluster hosts pass software checks
SUCCESS: All ILOM hosts are pingable
SUCCESS: All client interface IPs are pingable
SUCCESS: All admin eth0 interface IPs are pingable
SUCCESS: All private Infiniband interface IPs are pingable
INFO: All PDUs are pingable
SUCCESS: All InfiniBand switches are pingable
SUCCESS: Puppet master is running on host_1-master
SUCCESS: Puppet running on all cluster hosts
SUCCESS: Cloudera SCM server is running on host_3
SUCCESS: Cloudera SCM agent running on all cluster hosts
SUCCESS: Name Node is running on host_1
SUCCESS: Secondary Name Node is running on host_2
SUCCESS: Resource Manager is running on host_3
SUCCESS: Data Nodes running on all cluster hosts
SUCCESS: Node Managers running on all cluster slave hosts
INFO: Skipping Hadoop filesystem test because the hdfs user has no Kerberos ticket.
INFO: Use this command to get a Kerberos ticket for the hdfs user :
INFO: su hdfs -c “kinit hdfs@REALM.NAME”
SUCCESS: MySQL server is running on MySQL master node host_3
SUCCESS: MySQL server is running on MySQL backup node host_2
SUCCESS: Hive Server is running on Hive server node host_4
SUCCESS: Hive metastore server is running on Hive server node host_4
SUCCESS: Dnsmasq server running on all cluster hosts
INFO: Checking local DNS resolve of public hostnames on all cluster hosts
SUCCESS: All cluster hosts resolve public hostnames to private IPs
INFO: Checking local reverse DNS resolve of private IPs on all cluster hosts
SUCCESS: All cluster hosts resolve private IPs to public hostnames
SUCCESS: 2 virtual NICs available on all cluster hosts
SUCCESS: NTP service running on all cluster hosts
SUCCESS: At least one valid NTP server accessible from all cluster servers.
SUCCESS: Max clock drift of 0 seconds is within limits
SUCCESS: Big Data Appliance cluster health checks succeeded
[root@host_1 cloudera]#

b) Make sure services are healthy in CM.

c) Verify the output from the cluster verification checks is successful on Node 1 of the cluster:

mammoth -c

[root@host_1 cloudera]# mammoth -c
INFO: Logging all actions in /opt/oracle/BDAMammoth/bdaconfig/tmp/host_1-20180326093706.log and traces in /opt/oracle/BDAMammoth/bdaconfig/tmp/host_1-20180326093706.trc
INFO: This is the install of the primary rack
INFO: Creating nodelist files…
INFO: Checking if password-less ssh is set up
INFO: Executing checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Executing checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Checking passwordless ssh setup to host_1
host_1<–fdqn–>
INFO: Checking if password-less ssh is set up
INFO: Executing checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkRoot.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Executing checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed checkSSHAllNodes.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Reading component versions from /opt/oracle/BDAMammoth/bdaconfig/COMPONENTS
INFO: Getting factory serial numbers
INFO: Executing getserials.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed getserials.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Generated bdaserials on all nodes
SUCCESS: Ran /usr/bin/scp host_1:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_1 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_2:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_2 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_3:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_3 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_4:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_4 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_5:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_5 and it returned: RC=0
SUCCESS: Ran /usr/bin/scp host_6:/opt/oracle/bda/factory_serial_numbers /opt/oracle/bda/install/log/factory_serial_numbers-host_6 and it returned: RC=0

INFO: Executing genTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Executed genTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Successfully set up Kerberos test users.
INFO: Executing copyKeytab.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed copyKeytab.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Successfully copied keytabs to Mammoth node.
INFO: Executing oracleUser.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed oracleUser.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Doing post-cleanup operations
INFO: Running cluster validation checks and generating install summary
Enter CM admin password to enable check for CM services and hosts
Press ENTER twice to skip CM services and hosts checks
Enter password:
Enter password again:
INFO: Password saved. Doing Cloudera Manager health checks, please wait
Which Cluster Type? Hadoop Cluster
Is Kerberos enabled? Yes
INFO: Running validation tests may take up to 30 minutes depending on the size of the cluster, please wait

HttpFS Test
—————————————————————————————-
SUCCESS: Running httpfs server test succeeded. HTTP/1.1 200 OK
INFO: Test finished in 8 seconds. Details in httpfs_test.out
SUCCESS: HttpFS Test succeeded

Hive Server 2 Test
—————————————————————————————-
INFO: HiveServer2 test – Query database/table info via Beeline
INFO: Test finished in 14 seconds. Details in hiveserver2_test.out
SUCCESS: Hive Server 2 Test succeeded

Spark Test
—————————————————————————————-
INFO: final status: SUCCEEDED
SUCCESS: Pi is roughly 3.1395511395511395
INFO: Test finished in 33 seconds. Details in spark_test.out
SUCCESS: Spark Test succeeded

Spark2 Test
—————————————————————————————-
SUCCESS: Pi is roughly 3.1424471424471423
INFO: Test finished in 33 seconds. Details in spark2_test.out
SUCCESS: Spark2 Test succeeded

Orabalancer Test
—————————————————————————————-
SUCCESS: Oracle Perfect Balance test passed
INFO: Test finished in 77 seconds. Details in balancer_test.out
SUCCESS: Orabalancer Test succeeded

WebHCat Test
—————————————————————————————-
SUCCESS: creating a hcatlog database succeeded. HTTP/1.1 200 OK
SUCCESS: creating a table succeeded. HTTP/1.1 200 OK
SUCCESS: creating a partition succeeded. HTTP/1.1 200 OK
SUCCESS: creating a colum succeeded. HTTP/1.1 200 OK
SUCCESS: creating a property succeeded. HTTP/1.1 200 OK
SUCCESS: describing hcat table succeeded. HTTP/1.1 200 OK
SUCCESS: deleting hcat table succeeded. HTTP/1.1 200 OK
SUCCESS: deleting hcat database succeeded. HTTP/1.1 200 OK
INFO: Test finished in 97 seconds. Details in webhcat_test.out
SUCCESS: WebHCat Test succeeded

Hive Metastore Test
—————————————————————————————-
INFO: Query Hive Metastore Table Passed on node host_1
INFO: Query Hive Metastore Table Passed on node host_2
INFO: Query Hive Metastore Table Passed on node host_3
INFO: Query Hive Metastore Table Passed on node host_4
INFO: Query Hive Metastore Table Passed on node host_5
INFO: Query Hive Metastore Table Passed on node host_6
INFO: Test finished in 107 seconds. Details in metastore_test.out
SUCCESS: Hive Metastore Test succeeded

Teragen-sort-validate Test
—————————————————————————————-
INFO: Test finished in 234 seconds. Details in terasort.out
SUCCESS: Teragen-sort-validate Test succeeded

Oozie Workflow Test
—————————————————————————————-
INFO: Map Reduce Job Status: OK job_1521738483661_0031 SUCCEEDED
INFO: Pig Job Status: OK job_1521738483661_0019 SUCCEEDED
INFO: Hive Job Status: OK job_1521738483661_0023 SUCCEEDED
INFO: Sqoop Job Status: OK job_1521738483661_0026 SUCCEEDED
INFO: Streaming Job Status: OK job_1521738483661_0028 SUCCEEDED
INFO: Test finished in 245 seconds. Details in ooziewf_test.out
SUCCESS: Oozie Workflow Test succeeded

BDA Cluster Check
—————————————————————————————-
host_5:
host_2:
host_6:
host_1:
host_4:
host_3:
INFO: All PDUs are pingable
SUCCESS: Big Data Appliance cluster health checks succeeded
INFO: Test finished in 212 seconds. Details in bdacheckcluster.out
SUCCESS: BDA Cluster Check succeeded
========================================================================================
TEST LOG STATUS TIME(s)
—————————————————————————————-
BDA_Cluster_Check bdacheckcluster.out SUCCESS 212
Teragen-sort-validate_Test terasort.out SUCCESS 234
Oozie_Workflow_Test ooziewf_test.out SUCCESS 245
Hive_Metastore_Test metastore_test.out SUCCESS 107
Hive_Server_2_Test hiveserver2_test.out SUCCESS 14
WebHCat_Test webhcat_test.out SUCCESS 97
HttpFS_Test httpfs_test.out SUCCESS 8
Orabalancer_Test balancer_test.out SUCCESS 77
Spark_Test spark_test.out SUCCESS 33
Spark2_Test spark2_test.out SUCCESS 33
—————————————————————————————-
Total time : 457 sec.
========================================================================================
INFO: Executing oracleUserDestroy.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
SUCCESS: Executed oracleUserDestroy.sh on nodes /opt/oracle/BDAMammoth/bdaconfig/tmp/all_nodes #Step -1#
INFO: Executing remTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Executed remTestUsers.sh on nodes host_1 #Step -1#
SUCCESS: Successfully removed Kerberos test users.
SUCCESS: Ran /bin/cp -pr /opt/oracle/BDAMammoth/bdaconfig/tmp/* /opt/oracle/bda/install/log/clusterchk/summary-20180326093820 and it returned: RC=0
SUCCESS: Ran /bin/rm -rf /opt/oracle/BDAMammoth/bdaconfig/tmp/* and it returned: RC=0
SUCCESS: Ran /bin/cp -prf /tmp/bdacheckcluster* /opt/oracle/bda/install/log/clusterchk/summary-20180326093820 and it returned: RC=0
INFO: Install summary copied to /opt/oracle/bda/install/log/clusterchk/summary-20180326093820
INFO: Time spent in post-cleanup operations was 602 seconds
========================================================================================
SUCCESS: Cluster validation checks were all successful
INFO: Please download the install summary zipfile from /tmp/<–clustername–>-install-summary.zip
========================================================================================
[root@host_1 cloudera]#

3. On Node 1 of the cluster find the existing https keystore password. Create an environment variable for it so it can be used in the steps below.
a) Get the existing https keystore password with: “bdacli getinfo cluster_https_keystore_password”.

Output looks like:

bdacli getinfo cluster_https_keystore_password
Enter the admin user for CM (press enter for admin): admin
Enter the admin password for CM:
3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki

 

b) Create an environment variable for use during the setup:
export PW=3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki
echo $PW
3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki

 

4. On Node 1 of the cluster find the existing https truststore password and https truststore path. Create an environment variable for each so they can be used in the steps below.

a) Get the existing https truststore password with: “bdacli getinfo cluster_https_truststore_password”.
Output looks like:

bdacli getinfo cluster_https_truststore_password
Enter the admin user for CM (press enter for admin):
Enter the admin password for CM:
FghaVNdTCkMatGgOhZITygmOzqY5IFqBKBhLUsY40IPpezLx89TmQF61CmcBKEoS

b) Create an environment variable for use during the setup:

export TPW=FghaVNdTCkMatGgOhZITygmOzqY5IFqBKBhLUsY40IPpezLx89TmQF61CmcBKEoS
echo $TPW
FghaVNdTCkMatGgOhZITygmOzqY5IFqBKBhLUsY40IPpezLx89TmQF61CmcBKEoS

c) Get the existing https truststore path with: “bdacli getinfo cluster_https_truststore_path”.

Output looks like below for a cluster name of “<–clustername–>”.

# bdacli getinfo cluster_https_truststore_path
Enter the admin user for CM (press enter for admin):
Enter the admin password for CM:
/opt/cloudera/security/jks/<–clustername–>.truststore

d) Create the corresponding environment variables.

Example based on a cluster name of “<–clustername–>”:
export TPATH=/opt/cloudera/security/jks/<–clustername–>.truststore
echo $TPATH
/opt/cloudera/security/jks/<–clustername–>.truststore

Steps to set up user-provided certificates for web consoles and Hadoop network encryption

Perform all steps as the 'root' user. On the BDA cluster, perform the steps on Node 1 unless specified otherwise; the only exceptions are the Hue service updates and starting/stopping the Cloudera Manager server.

1. Stop Cloudera Manager (CM) services.

a) Log into Cloudera Manager as ‘admin’ user.

b) Stop the cluster services: Home > <cluster_name> dropdown > Stop

c) Stop the Cloudera Management Service: Home > mgmt dropdown > Stop

d) Stop the Cloudera Manager agents:

dcli -C service cloudera-scm-agent stop

Verify with:

dcli -C service cloudera-scm-agent status

e) Stop the Cloudera Manager server from Node 3:

service cloudera-scm-server stop

Verify with:

service cloudera-scm-server status

 

2. Create a new /opt/cloudera/security, backing up the existing directory first (in case anything needs to be restored). Do this on all nodes of the cluster.
a) Back up the existing /opt/cloudera/security on all cluster nodes. /opt/cloudera/security is the base location for security-related files:

 

dcli -C “mv /opt/cloudera/security /opt/cloudera/security.BAK_`date +%d%b%Y%H%M%S`”

 

b) Create a new /opt/cloudera/security and related sub-directories on all cluster nodes. /opt/cloudera/security/jks is the location for the Java-based keystore and truststore files used by Cloudera Manager and the Java-based cluster services, and /opt/cloudera/security/x509 is the location for the openssl key, cert and cacerts files used by the Cloudera Manager Agent and Hue.

dcli -C mkdir -p /opt/cloudera/security/jks
dcli -C mkdir -p /opt/cloudera/security/x509

 

3. Create a staging directory on the first server and upload the ".pfx" and ".cer" files. In our case there is one ".pfx" file (containing the private keys, covering all hostnames on both the client and management networks) and two ".cer" files (one with the public certificate of the given pfx file, the other with the root certificate).

[root@host_1 cloudera]# cd /root/staging/
[root@host_1 staging]# ls -lrt
-rw-r--r-- 1 root root 1500 Mar 20 17:42 <–root_cer–>.cer
-rw-r--r-- 1 root root 2928 Mar 20 18:19 <–certificate–>.cer
-rw-r--r-- 1 root root 3892 Mar 21 15:58 <–certificate–>.pfx
[root@host_1 staging]#

4. Since the given pfx file is delivered with a generic password and the alias "certreq-2df655dc-bb52-4442-9612-bc0622375d2f", before starting I have to find the alias and also change the password.

[root@host_1 ~]# keytool -list -keystore /root/staging/<–certificate–>.pfx
Enter keystore password:
***************** WARNING WARNING WARNING *****************
* The integrity of the information stored in your keystore *
* has NOT been verified! In order to verify its integrity, *
* you must provide your keystore password. *
***************** WARNING WARNING WARNING *****************
Keystore type: JKS
Keystore provider: SUN
Your keystore contains 1 entry
certreq-2df655dc-bb52-4442-9612-bc0622375d2f, Mar 20, 2018, PrivateKeyEntry,
[root@host_1 ~]#

a) To change the alias, run the command below; it will also create "/opt/cloudera/security/jks/node.jks".

-- to import
[root@host_1 ~]# keytool -importkeystore -srckeystore /root/staging/<–certificate–>.pfx -srcstoretype pkcs12 -destkeystore /opt/cloudera/security/jks/node.jks -deststoretype JKS -srcalias certreq-2df655dc-bb52-4442-9612-bc0622375d2f -destalias <–alias that you want to set–>

 

— to check
keytool -keystore /opt/cloudera/security/jks/node.jks -list
[root@host_1 ~]# keytool -keystore /opt/cloudera/security/jks/node.jks -list
Enter keystore password:
***************** WARNING WARNING WARNING *****************
* The integrity of the information stored in your keystore *
* has NOT been verified! In order to verify its integrity, *
* you must provide your keystore password. *
***************** WARNING WARNING WARNING *****************
Keystore type: JKS
Keystore provider: SUN
Your keystore contains 1 entry
<–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
[root@host_1 ~]#

b) To change the password, run the command below:
-- change the password for alias <–alias that you want to set–>
keytool -keypasswd -keystore /opt/cloudera/security/jks/node.jks -alias <–alias that you want to set–>

[root@host_1 ~]# keytool -keypasswd -keystore /opt/cloudera/security/jks/node.jks -alias <–alias that you want to set–>
Enter keystore password:
Enter key password for <–alias that you want to set–>
New key password for <–alias that you want to set–>:
Re-enter new key password for <–alias that you want to set–>:
[root@host_1 ~]#
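If you prefer a non-interactive form, keytool also accepts the passwords on the command line. This is a sketch, not from the original post; it assumes node.jks was created with the Mammoth keystore password in $PW as its store password, that you want the key password to match $PW as well, and that you know the generic key password delivered with the pfx:

keytool -keypasswd -keystore /opt/cloudera/security/jks/node.jks -alias <–alias that you want to set–> -storepass $PW -keypass <generic password from the CA> -new $PW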

5. Next, the root certificate has to be imported into the node.jks keystore we created:

-- import the <–your company's root cer–> root certificate
keytool -keystore /opt/cloudera/security/jks/node.jks -alias <–alias of root cer–> -import -file /opt/cloudera/security/jks/<–your company root cer–>.cer -storepass $PW -keypass $PW -noprompt

 

[root@host_1 ~]# keytool -keystore /opt/cloudera/security/jks/node.jks -alias <–alias of root cer–> -import -file /opt/cloudera/security/jks/<–your company root cer–>.cer -storepass $PW -keypass $PW -noprompt
Certificate was added to keystore

 

–check
[root@host_1 ~]# keytool -keystore /opt/cloudera/security/jks/node.jks -list
Enter keystore password:
***************** WARNING WARNING WARNING *****************
* The integrity of the information stored in your keystore *
* has NOT been verified! In order to verify its integrity, *
* you must provide your keystore password. *
***************** WARNING WARNING WARNING *****************
Keystore type: JKS
Keystore provider: SUN
Your keystore contains 2 entries
<–alias of root cer–>, Mar 20, 2018, trustedCertEntry, —-> this is coming from root certificate
Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
<–alias that you want to set from previous entry–>, Mar 20, 2018, PrivateKeyEntry, —-> this is coming from pfx file
Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
[root@host_1 ~]#

6. After importing all required certificates, we have to align the rest of the cluster:

–copy to whole environment
dcli -C -f /opt/cloudera/security/jks/<–your company root cer–>.cer -d /opt/cloudera/security/jks/
dcli -C -f /opt/cloudera/security/jks/node.jks -d /opt/cloudera/security/jks/
dcli -C -f /root/staging/<–certificate–> -d /opt/cloudera/security/jks/

 

–check whole environment
[root@host_1 ~]# dcli -C keytool -keystore /opt/cloudera/security/jks/node.jks -list
…..
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
xx.xx.xx.xx: Keystore type: JKS
xx.xx.xx.xx: Keystore provider: SUN
xx.xx.xx.xx:
xx.xx.xx.xx: Your keystore contains 2 entries
xx.xx.xx.xx:
xx.xx.xx.xx: <–alias of root cer–>, Mar 20, 2018, trustedCertEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
xx.xx.xx.xx: <–alias that you want to set–>, Mar 20, 2018, PrivateKeyEntry,
xx.xx.xx.xx: Certificate fingerprint (SHA1): 02:24:BC:E1:9E:29:AE:C0:7B:F9:B3:8A:86:14:45:92:55:E6:03:DB
[root@host_1 ~]#

7. Next, we have to create the truststore as below:

-- put the root certificate into $TPATH (which is /opt/cloudera/security/jks/<–clustername–>.truststore)

[root@host_1 ~]# keytool -keystore $TPATH -alias <–alias of root cer–> -import -file /opt/cloudera/security/jks/IS4F_ROOT_CA_B64.cer
Enter keystore password:
Re-enter new password:
Owner: CN=<–your company root cer name–>, O=<–your company root cer name–>, C=BE
Issuer: CN=<–your company root cer name–>, O=I<–your company root cer name–>, C=BE
Serial number: 40000000001464238d3e9
Valid from: Wed May 28 12:00:00 CEST 2014 until: Sat May 28 12:00:00 CEST 2039
Certificate fingerprints:
MD5: 84:2C:DD:A9:D4:1A:C2:25:79:60:C7:23:24:44:06:43
SHA1: 66:73:3B:4D:90:0C:F1:B1:EA:D4:76:33:F2:74:37:07:8E:3A:E8:01
SHA256: AC:6C:06:2F:05:F3:0E:66:1D:58:6F:1D:D4:14:B5:A8:D8:47:8D:A5:DA:B6:C9:AB:88:91:30:92:0B:82:4B:73
Signature algorithm name: SHA1withRSA
Version: 3
Extensions:
#1: ObjectId: 2.5.29.19 Criticality=true
BasicConstraints:[
CA:true
PathLen:1
]
#2: ObjectId: 2.5.29.32 Criticality=false
CertificatePolicies [
[CertificatePolicyId: [1.3.6.1.4.1.8162.1.3.2.10.1.0]
[PolicyQualifierInfo: [
qualifierID: 1.3.6.1.5.5.7.2.2
qualifier: 0000: 30 36 1A 34 68 74 74 70 73 3A 2F 2F 72 6F 6F 74 06.4https://root
0010: 2E 49 53 34 46 70 6B 69 73 65 72 76 69 63 65 73 .IS4Fpkiservices
0020: 2E 63 6F 6D 2F 49 53 34 46 5F 52 6F 6F 74 43 41 .com/<–alias of root cer–>
0030: 5F 43 50 53 2E 70 64 66 _CPS.pdf
]] ]
]
#3: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
Key_CertSign
Crl_Sign
]
#4: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 4B 09 C5 83 63 B9 3D 54 5C 1B 60 A2 28 F9 1A 6D K…c.=T\.`.(..m
0010: 0F F8 1F 1C ….
]
]
Trust this certificate?host_1 [no]: yes
Certificate was added to keystore

[root@ ~]#

–copy truststore to whole environment
[root@host_1 ~]# dcli -C -f $TPATH -d $TPATH

–check
[root@host_1 ~]# dcli -C ls -lrt $TPATH
192.168.11.16: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.17: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.18: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.19: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.20: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
192.168.11.21: -rw-r--r-- 1 root root 1113 Mar 20 17:57 /opt/cloudera/security/jks/<–clustername–>.truststore
[root@host_1 ~]#
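Beyond checking that the file exists, you can also confirm that the root CA entry is present in the truststore on every node (a quick check, not in the original; it assumes $TPW was used as the truststore password when the truststore was created above):

[root@host_1 ~]# dcli -C "keytool -list -keystore $TPATH -storepass $TPW"

Each node should list the <–alias of root cer–> entry as a trustedCertEntry.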

8. Copy the CA public certificate file to an agents.pem file to be used by Cloudera Manager agents and Hue on each node of the cluster.

-- copy the CA certificate as agents.pem

dcli -C -f /opt/cloudera/security/jks/<–alias of root cer–>.cer -d /opt/cloudera/security/x509/agents.pem
[root@host_1 ~]# dcli -C -f /opt/cloudera/security/jks/<–alias of root cer–>.cer -d /opt/cloudera/security/x509/agents.pem

–check

[root@host_1 ~]# dcli -C ls -lrt /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
xx.xx.xx.xx: -rw-r--r-- 1 root root 1500 Mar 20 17:58 /opt/cloudera/security/x509/agents.pem
[root@host_1 ~]#
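To double-check that agents.pem really contains the CA public certificate, a small verification (not in the original steps):

[root@host_1 ~]# openssl x509 -in /opt/cloudera/security/x509/agents.pem -noout -subject -issuer -dates

The subject and issuer should match your company root CA and the dates should show the expected validity window.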

9. Set up the CA public certificate (ca.crt) for the Hue service. All commands in this step are run on the Hue server node.
a) ssh to the Hue node.
b) Verify /opt/cloudera/security/jks/<–your company root cer name–>.cer:
–for HUE go to HUE host

[root@hue_host ~]# ls -l /opt/cloudera/security/jks/<–your company root cer name–>.cer
-rw-r–r– 1 root root 1500 Mar 20 17:53 /opt/cloudera/security/jks/<–your company root cer name–>.cer

c) Create the link to /opt/cloudera/security/x509/hue.pem.
-- create a link for hue
[root@hue_host ~]# ln /opt/cloudera/security/jks/<–your company root cer name–>.cer /opt/cloudera/security/x509/hue.pem

–check
[root@hue_host ~]# ls -l /opt/cloudera/security/x509/hue.pem
-rw-r--r-- 2 root root 1500 Mar 20 17:53 /opt/cloudera/security/x509/hue.pem
[root@hue_host ~]#
d) On the Hue server node, export the existing https keystore password collected in the "Prerequisites" section. For example:

# export PW=3ZnkFUO9rYKFqu0gcusnFwgDUuvlzN3wU0UJit6CuobaVSl67QuyAJUq4WaNSzki  

e) Run the keytool commands to import the key and create required files for HUE:
-- import the key <–alias that you want to set–> for HUE

/usr/java/latest/bin/keytool -importkeystore -srckeystore /opt/cloudera/security/jks/node.jks -srcstorepass $PW -srckeypass $PW -destkeystore /tmp/hue_host-keystore.p12 -deststoretype PKCS12 -srcalias <–alias that you want to set–> -deststorepass $PW -destkeypass $PW -noprompt

–Run the “openssl pkcs12” command:

openssl pkcs12 -in /tmp/${HOSTNAME}-keystore.p12 -passin pass:${PW} -nocerts -out /opt/cloudera/security/x509/node.key -passout pass:${PW}
[root@hue_host ~]# openssl pkcs12 -in /tmp/hue_host-keystore.p12 -passin pass:${PW} -nocerts -out /opt/cloudera/security/x509/node.key -passout pass:${PW}
MAC verified OK
[root@hue_host ~]#

–Run the “openssl rsa” command

openssl rsa -in /opt/cloudera/security/x509/node.key -passin pass:${PW} -out /opt/cloudera/security/x509/node.hue.key
[root@hue_host ~]# openssl rsa -in /opt/cloudera/security/x509/node.key -passin pass:${PW} -out /opt/cloudera/security/x509/node.hue.key
writing RSA key
[root@hue_host ~]#

–Run the “openssl pkcs12” command like:

openssl pkcs12 -in /tmp/${HOSTNAME}-keystore.p12 -passin pass:${PW} -nokeys -out /opt/cloudera/security/x509/node.cert
[root@hue_host ~]# openssl pkcs12 -in /tmp/${HOSTNAME}-keystore.p12 -passin pass:${PW} -nokeys -out /opt/cloudera/security/x509/node.cert
MAC verified OK
[root@hue_host~]#

f) Change the owner on the files to be “hue”

 

[root@hue_host ~]# cd /opt/cloudera/security/x509/
[root@hue_host x509]# ls -l
total 20
-rw-r--r-- 1 root root 1500 Mar 20 17:58 agents.pem
-rw-r--r-- 2 root root 1500 Mar 20 17:53 hue.pem
-rw-r--r-- 1 root root 3122 Mar 20 18:08 node.cert
-rw-r--r-- 1 root root 1675 Mar 20 18:07 node.hue.key
-rw-r--r-- 1 root root 1977 Mar 20 18:06 node.key

[root@hue_host x509]#
chown hue /opt/cloudera/security/x509/node.key
chown hue /opt/cloudera/security/x509/node.cert
chown hue /opt/cloudera/security/x509/node.hue.key

[root@hue_host x509]# ls -l
total 20
-rw-r--r-- 1 root root 1500 Mar 20 17:58 agents.pem
-rw-r--r-- 2 root root 1500 Mar 20 17:53 hue.pem
-rw-r--r-- 1 hue root 3122 Mar 20 18:08 node.cert
-rw-r--r-- 1 hue root 1675 Mar 20 18:07 node.hue.key
-rw-r--r-- 1 hue root 1977 Mar 20 18:06 node.key
[root@hue_host x509]#

10. Start everything and check the Cloudera Manager server logs on the Cloudera Manager host (Node 3 by default) for any SSL errors.
a) Start the Cloudera Manager server from Node 3:

#  service cloudera-scm-server start

Verify:

# service cloudera-scm-server status

b) From Node 1, start the agents:

# dcli -C service cloudera-scm-agent start

Verify:

# dcli -C service cloudera-scm-agent status

c) Log into CM as the "admin" user. Note this is like a fresh, first-time login.
d) Start the mgmt service: Home > mgmt dropdown > Start
e) Start the cluster: Home > <cluster-name> dropdown > Start

 

11. Make sure the cluster is healthy.
a) Verify with:

# bdacheckcluster

b) Make sure services are healthy in CM.
c) Verify the output from the cluster verification checks is successful:

# cd /opt/oracle/BDAMammoth

# ./mammoth -c

 

If you face any kind of problem, simply roll back by replacing your /opt/cloudera/security directory with the one you backed up at the beginning.

Steps to Configure a Single-Node YARN Cluster

March 30, 2016

The following type of installation is often referred to as “pseudo-distributed” because it mimics some of the functionality of a distributed Hadoop cluster. A single machine is, of course, not practical for any production use, nor is it parallel. A small-scale Hadoop installation can provide a simple method for learning Hadoop basics, however.

The recommended minimal installation hardware is a dual-core processor with 2 GB of RAM and 2 GB of available hard drive space. The system will need a recent Linux distribution with Java installed (e.g., Red Hat Enterprise Linux or rebuilds, Fedora, Suse Linux Enterprise, OpenSuse, Ubuntu). Red Hat Enterprise Linux 6.3 is used for this installation example. A bash shell environment is also assumed. The first step is to download Apache Hadoop.

Step 1: Download Apache Hadoop

Download the latest distribution from the Hadoop website (http://hadoop.apache.org/). For example, as root do the following:

# cd /root
# wget http://mirrors.ibiblio.org/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

Next create and extract the package in /opt/yarn:

# mkdir -p /opt/yarn

# cd /opt/yarn
# tar xvzf /root/hadoop-2.2.0.tar.gz

Step 2: Set JAVA_HOME

For Hadoop 2, the recommended version of Java can be found at http://wiki.apache.org/hadoop/HadoopJavaVersions. In general, a Java Development Kit 1.6 (or greater) should work. For this install, we will use Open Java 1.6.0_24, which is part of Red Hat Enterprise Linux 6.3. Make sure you have a working Java JDK installed; in this case, it is the Java-1.6.0-openjdk RPM. To include JAVA_HOME for all bash users (other shells must be set in a similar fashion), make an entry in /etc/profile.d as follows:

# echo "export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/" > /etc/profile.d/java.sh

To make sure JAVA_HOME is defined for this session, source the new script:

# source /etc/profile.d/java.sh
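To confirm the setting took effect (a quick check, not part of the original steps):

# echo $JAVA_HOME
# $JAVA_HOME/bin/java -version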

Step 3: Create Users and Groups

It is best to run the various daemons with separate accounts. Three accounts (yarn, hdfs, mapred) in the group hadoop can be created as follows:

# groupadd hadoop

# useradd -g hadoop yarn

# useradd -g hadoop hdfs

# useradd -g hadoop mapred

Step 4: Make Data and Log Directories

Hadoop needs various data and log directories with various permissions. Enter the following lines to create these directories:

# mkdir -p /var/data/hadoop/hdfs/nn

# mkdir -p /var/data/hadoop/hdfs/snn

# mkdir -p /var/data/hadoop/hdfs/dn

# chown hdfs:hadoop /var/data/hadoop/hdfs -R

# mkdir -p /var/log/hadoop/yarn

# chown yarn:hadoop /var/log/hadoop/yarn -R

Next, move to the YARN installation root and create the log directory and set the owner and group as follows:

# cd /opt/yarn/hadoop-2.2.0

# mkdir logs
# chmod g+w logs

# chown yarn:hadoop . -R

Step 5: Configure core-site.xml

From the base of the Hadoop installation path (e.g., /opt/yarn/hadoop-2.2.0), edit the etc/hadoop/core-site.xml file. The original installed file will have no entries other than the <configuration></configuration> tags. Two properties need to be set. The first is the fs.default.name property, which sets the host and request port name for the NameNode (metadata server for HDFS). The second is hadoop.http.staticuser.user, which will set the default user name to hdfs. Copy the following lines to the Hadoop etc/hadoop/core-site.xml file and remove the original empty <configuration> </configuration> tags.

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

<property>

<name>hadoop.http.staticuser.user</name>

<value>hdfs</value>

</property>

</configuration>

Step 6: Configure hdfs-site.xml

From the base of the Hadoop installation path, edit the etc/hadoop/hdfs-site.xml file. In the single-node pseudo-distributed mode, we don’t need or want the HDFS to replicate file blocks. By default, HDFS keeps three copies of each file in the file system for redundancy. There is no need for replication on a single machine; thus the value of dfs.replication will be set to 1.

In hdfs-site.xml, we specify the NameNode, Secondary NameNode, and DataNode data directories that we created in Step 4. These are the directories used by the various components of HDFS to store data. Copy the following lines into Hadoop etc/hadoop/hdfs-site.xml and remove the original empty <configuration> </configuration> tags.

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/var/data/hadoop/hdfs/nn</value>

</property>

<property>

<name>fs.checkpoint.dir</name>

<value>file:/var/data/hadoop/hdfs/snn</value>

</property>

<property>

<name>fs.checkpoint.edits.dir</name>

<value>file:/var/data/hadoop/hdfs/snn</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/var/data/hadoop/hdfs/dn</value>

</property>

</configuration>

 

Step 7: Configure mapred-site.xml

From the base of the Hadoop installation, edit the etc/hadoop/mapred-site.xml file. A new configuration option for Hadoop 2 is the capability to specify a framework name for MapReduce, setting the mapreduce.framework.name property. In this install, we will use the value of "yarn" to tell MapReduce that it will run as a YARN application. First, copy the template file to mapred-site.xml.

# cp mapred-site.xml.template mapred-site.xml

Next, copy the following lines into Hadoop etc/hadoop/mapred-site.xml file and

remove the original empty <configuration> </configuration> tags.

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

</configuration>

 

Step 8: Configure yarn-site.xml

From the base of the Hadoop installation, edit the etc/hadoop/yarn-site.xml file. The yarn.nodemanager.aux-services property tells NodeManagers that there will be an auxiliary service called mapreduce_shuffle that they need to implement. After we tell the NodeManagers to implement that service, we give it a class name as the means to implement that service. This particular configuration tells MapReduce how to do its shuffle. Because NodeManagers won't shuffle data for a non-MapReduce job by default, we need to configure such a service for MapReduce. Copy the following lines to the Hadoop etc/hadoop/yarn-site.xml file and remove the original empty <configuration> </configuration> tags.

<configuration>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

</configuration>

Step 9: Modify Java Heap Sizes

The Hadoop installation uses several environment variables that determine the heap sizes for each Hadoop process. These are defined in the etc/hadoop/*-env.sh files used by Hadoop. The default for most of the processes is a 1 GB heap size; because we’re running on a workstation that will probably have limited resources compared to a standard server, however, we need to adjust the heap size settings. The values that follow are adequate for a small workstation or server.

Edit the etc/hadoop/hadoop-env.sh file to reflect the following (don’t forget to remove the “#” at the beginning of the line):

HADOOP_HEAPSIZE="500"

HADOOP_NAMENODE_INIT_HEAPSIZE="500"

Next, edit mapred-env.sh to reflect the following:

HADOOP_JOB_HISTORYSERVER_HEAPSIZE=250

Finally, edit yarn-env.sh to reflect the following:

JAVA_HEAP_MAX=-Xmx500m

The following line will need to be added to yarn-env.sh:

YARN_HEAPSIZE=500

Step 10: Format HDFS

For the HDFS NameNode to start, it needs to initialize the directory where it will hold its data. The NameNode service tracks all the metadata for the file system. The format process will use the value assigned to dfs.namenode.name.dir in etc/hadoop/hdfs-site.xml earlier (i.e., /var/data/hadoop/hdfs/nn). Formatting destroys everything in the directory and sets up a new file system. Format the NameNode directory as the HDFS superuser, which is typically the "hdfs" user account.

From the base of the Hadoop distribution, change directories to the "bin" directory and execute the following commands:

# su - hdfs

$ cd /opt/yarn/hadoop-2.2.0/bin

$ ./hdfs namenode -format

If the command worked, you should see the following near the end of a long list of messages:

INFO common.Storage: Storage directory /var/data/hadoop/hdfs/nn has been successfully formatted.

 

Step 11: Start the HDFS Services

Once formatting is successful, the HDFS services must be started. There is one service for the NameNode (metadata server), a single DataNode (where the actual data is stored), and the SecondaryNameNode (checkpoint data for the NameNode). The Hadoop distribution includes scripts that set up these commands as well as name other values such as PID directories, log directories, and other standard process configurations. From the bin directory in Step 10, execute the following as user hdfs:

$ cd ../sbin

$ ./hadoop-daemon.sh start namenode

The command should show the following:

starting namenode, logging to /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-namenode-limulus.out

The secondarynamenode and datanode services can be started in the same way:

$ ./hadoop-daemon.sh start secondarynamenode
starting secondarynamenode, logging to /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-secondarynamenode-limulus.out

$ ./hadoop-daemon.sh start datanode
starting datanode, logging to /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-datanode-limulus.out

If the daemon started successfully, you should see responses that will point to the log file. (Note that the actual log file is appended with “.log,” not “.out.”). As a sanity check, issue a jps command to confirm that all the services are running. The actual PID (Java Process ID) values will be different than shown in this listing:

$ jps

15140 SecondaryNameNode

15015 NameNode

15335 Jps

15214 DataNode

If the process did not start, it may be helpful to inspect the log files. For instance, examine the log file for the NameNode. (Note that the path is taken from the preceding command.)

vi /opt/yarn/hadoop-2.2.0/logs/hadoop-hdfs-namenode-limulus.log

All Hadoop services can be stopped using the hadoop-daemon.sh script. For example, to stop the datanode service, enter the following (as user hdfs in the /opt/yarn/hadoop-2.2.0/sbin directory):

$ ./hadoop-daemon.sh stop datanode

The same can be done for the NameNode and SecondaryNameNode.
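For completeness, stopping the other two HDFS daemons uses the same script with the corresponding service names:

$ ./hadoop-daemon.sh stop namenode
$ ./hadoop-daemon.sh stop secondarynamenode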

Step 12: Start YARN Services

As with HDFS services, the YARN services need to be started. One ResourceManager and one NodeManager must be started as user yarn (exiting from user hdfs first):

$ exit

logout

# su - yarn

$ cd /opt/yarn/hadoop-2.2.0/sbin

$ ./yarn-daemon.sh start resourcemanager

starting resourcemanager, logging to /opt/yarn/hadoop-2.2.0/logs/yarn-yarn-resourcemanager-limulus.out

$ ./yarn-daemon.sh start nodemanager
starting nodemanager, logging to /opt/yarn/hadoop-2.2.0/logs/yarn-yarn-nodemanager-limulus.out

As when the HDFS daemons were started in Step 11, the status of the running daemons is sent to their respective log files. To check whether the services are running, issue a jps command. The following shows all the services necessary to run YARN on a single server:

$ jps

15933 Jps

15567 ResourceManager

15785 NodeManager

If there are missing services, check the log file for the specific service. Similar to the case with HDFS services, the services can be stopped by issuing a stop argument to the daemon script:

./yarn-daemon.sh stop nodemanager

Step 13: Verify the Running Services Using the Web Interface

Both HDFS and the YARN ResourceManager have a web interface. These interfaces are a convenient way to browse many of the aspects of your Hadoop installation. To monitor HDFS, enter the following (or use your favorite web browser):

$ firefox http://localhost:50070

Connecting to port 50070 will bring up a web interface similar to Figure 1.1. The web interface for the ResourceManager can be viewed by entering the following:

$ firefox http://localhost:8088

A webpage similar to that shown in Figure 1.2 will be displayed.


Figure 1.1 Webpage for HDFS file system

 


Figure 1.2 Webpage for YARN ResourceManager
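As a final sanity check (not part of the original steps), you can submit the bundled pi example to confirm the cluster can actually run a MapReduce job under YARN; the paths assume the install location used above:

# su - hdfs
$ cd /opt/yarn/hadoop-2.2.0
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 10

The job should finish with an estimate of Pi and appear as a completed application in the ResourceManager web interface.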

 

How-to: Quickly Configure Kerberos for Your Apache Hadoop Cluster “http://blog.cloudera.com/blog/2015/03/how-to-quickly-configure-kerberos-for-your-apache-hadoop-cluster/”

March 11, 2016

Use the scripts and screenshots below to configure a Kerberized cluster in minutes.

Kerberos is the foundation of securing your Apache Hadoop cluster. With Kerberos enabled, user authentication is required. Once users are authenticated, you can use projects like Apache Sentry (incubating) for role-based access control via GRANT/REVOKE statements.

Taming the three-headed dog that guards the gates of Hades is challenging, so Cloudera has put significant effort into making this process easier in Hadoop-based enterprise data hubs. In this post, you’ll learn how to stand-up a one-node cluster with Kerberos enforcing user authentication, using the Cloudera QuickStart VM as a demo environment.

If you want to read the product documentation, it’s available here. You should consider this reference material; I’d suggest reading it later to understand more details about what the scripts do.

Requirements

You need the following downloads to follow along.

Initial Configuration

Before you start the QuickStart VM, increase the memory allocation to 8GB RAM and increase the number of CPUs to two. You can get by with a little less RAM, but we will have everything including the Kerberos server running on one node.

Start up the VM and activate Cloudera Manager as shown here:

Give this script some time to run; it has to restart the cluster.

KDC Install and Setup Script

The script goKerberos_beforeCM.sh does all the setup work for the Kerberos server and the appropriate configuration parameters. The comments are designed to explain what is going on inline. (Do not copy and paste this script! It contains unprintable characters that are pretending to be spaces. Rather, download it.)

Cloudera Manager Kerberos Wizard

After running the script, you now have a working Kerberos server and can secure the Hadoop cluster. The wizard will do most of the heavy lifting; you just have to fill in a few values.

To start, log into Cloudera Manager by going to http://quickstart.cloudera:7180 in your browser. The userid is cloudera and the password is cloudera. (Almost needless to say but never use “cloudera” as a password in a real-world setting.)

There are lots of productivity tools here for managing the cluster but ignore them for now and head straight for the Administration > Kerberos wizard as shown in the next screenshot.

Click on the “Enable Kerberos” button.

The four checklist items were all completed by the script you’ve already run. Check off each item and select “Continue.”

The Kerberos Wizard needs to know the details of what the script configured. Fill in the entries as follows:

  • KDC Server Host: quickstart.cloudera
  • Kerberos Security Realm: CLOUDERA
  • Kerberos Encryption Types: aes256-cts-hmac-sha1-96

Click “Continue.”

Do you want Cloudera Manager to manage the krb5.conf files in your cluster? Remember, the whole point of this blog post is to make Kerberos easier. So, please check “Yes” and then select “Continue.”

The Kerberos Wizard is going to create Kerberos principals for the different services in the cluster. To do that it needs a Kerberos Administrator ID. The ID created is: cloudera-scm/admin@CLOUDERA.

The screen shot shows how to enter this information. Recall the password is: cloudera.

The next screen provides good news. It lets you know that the wizard was able to successfully authenticate.

OK, you’re ready to let the Kerberos Wizard do its work. Since this is a VM, you can safely select “I’m ready to restart the cluster now” and then click “Continue.” You now have time to go get a coffee or other beverage of your choice.

How long does that take? Just let it work.

Congrats, you are now running a Hadoop cluster secured with Kerberos.

Kerberos is Enabled. Now What?

The old method of su - hdfs will no longer provide administrator access to the HDFS filesystem. Here is how you become the hdfs user with Kerberos:
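A minimal sketch at the shell, assuming the setup script created an hdfs@CLOUDERA principal (check goKerberos_beforeCM.sh for the exact principals and passwords it adds):

    kinit hdfs@CLOUDERA      # obtain a Kerberos ticket for the hdfs principal
    klist                    # confirm the ticket was granted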

Now validate you can do hdfs user things:
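For example, operations that require HDFS superuser rights should now succeed (the test directory name here is just an example):

    hdfs dfsadmin -report                    # superuser-only report on the filesystem
    hdfs dfs -mkdir /kerberos_smoke_test
    hdfs dfs -ls /
    hdfs dfs -rm -r -skipTrash /kerberos_smoke_test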

Next, invalidate the Kerberos token so as not to break anything:
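kdestroy drops the cached ticket, and klist then confirms nothing is left:

    kdestroy    # throw away the cached ticket
    klist       # should now report that no credentials cache was found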

Next, the YARN min.user.id setting needs attention. By default the container executor refuses to launch containers for users whose UID is below min.user.id (1000), and the cloudera user’s UID falls below that threshold, so its jobs fail at container launch. In Cloudera Manager, open the YARN configuration and either lower the Minimum User ID (min.user.id) or add cloudera to the Allowed System Users (allowed.system.users) list.

Save the changes and restart the YARN service. Now validate that the cloudera user can use the cluster:
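A quick smoke test, assuming the setup script also created a cloudera@CLOUDERA principal; the examples jar path below may differ depending on whether CDH was installed from packages or parcels:

    kinit cloudera@CLOUDERA
    # run any job as the cloudera user, e.g. the bundled pi estimator
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 1000
    hdfs dfs -ls /user/cloudera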

If you forget to kinit before trying to use the cluster, commands fail with Kerberos errors about missing credentials (for example, a GSSException complaining that no valid credentials were provided). The simple fix is to kinit as the principal you wish to use.
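When that happens, check for a ticket and obtain one if it is missing:

    klist                      # no ticket listed? that is the problem
    kinit cloudera@CLOUDERA    # get one, then re-run the failed command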

Spark Streaming with HBase

January 25, 2016

What is Spark Streaming?

First of all, what is streaming? A data stream is an unbounded sequence of data arriving continuously. Streaming divides continuously flowing input data into discrete units for processing. Stream processing is low latency processing and analyzing of streaming data. Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data. Spark Streaming is for use cases which require a significant amount of data to be quickly processed as soon as it arrives. Example real-time use cases are:

  • Website monitoring, network monitoring
  • Fraud detection
  • Web clicks
  • Advertising
  • Internet of Things: sensors

Spark Streaming supports data sources such as HDFS directories, TCP sockets, Kafka, Flume, Twitter, etc. Data streams can be processed with Spark’s core APIs, DataFrames and SQL, or machine learning APIs, and can be persisted to a filesystem, HDFS, databases, or any data source offering a Hadoop OutputFormat.

How Spark Streaming Works

Streaming data is continuous and needs to be batched for processing. Spark Streaming divides the data stream into batches of X seconds; this batched stream is called a DStream and is internally a sequence of RDDs. Your Spark application processes the RDDs using Spark APIs, and the processed results of the RDD operations are returned in batches.

Architecture of the example Streaming Application

The Spark Streaming example code does the following:

  • Reads streaming data.
  • Processes the streaming data.
  • Writes the processed data to an HBase Table.

Other Spark example code does the following:

  • Reads HBase Table data written by the streaming code
  • Calculates daily summary statistics
  • Writes summary statistics to the HBase table Column Family stats

Example data set

The oil pump sensor data comes in as comma-separated value (csv) files dropped in a directory. Spark Streaming will monitor the directory and process any files created in it. (As stated before, Spark Streaming supports different streaming data sources; for simplicity, this example uses files.) Each line holds the pump ID, a date and a time, followed by six sensor readings: hz, disp, flo, sedPPM, psi, and chlPPM.

We use a Scala case class to define the Sensor schema corresponding to the sensor data csv files, and a parseSensor function to parse the comma-separated values into the Sensor case class.

// schema for sensor data
case class Sensor(resid: String, date: String, time: String, hz: Double, disp: Double, flo: Double, 
          sedPPM: Double, psi: Double, chlPPM: Double)

object Sensor {
   // function to parse line of csv data into Sensor class
   def parseSensor(str: String): Sensor = {
       val p = str.split(",")
        Sensor(p(0), p(1), p(2), p(3).toDouble, p(4).toDouble, p(5).toDouble, p(6).toDouble,
            p(7).toDouble, p(8).toDouble)
  }
…
}
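As a quick sanity check (the line below is a made-up example in that format, not a row from the real data set):

    // made-up csv line: resid,date,time,hz,disp,flo,sedPPM,psi,chlPPM
    val line = "PUMP1,3/10/14,1:01,10.27,1.73,881.0,1.56,85.0,1.94"
    val sensor = Sensor.parseSensor(line)
    // sensor.resid == "PUMP1", sensor.psi == 85.0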

HBase Table schema

The HBase Table Schema for the streaming data is as follows:

  • Composite row key of the pump name, date, and time stamp
  • Column Family data, with columns corresponding to the input data fields
  • Column Family alerts, with columns corresponding to any filters for alarming values

Note that the data and alerts column families could be set to expire values after a certain amount of time (see the sketch after the schema lists below).

The Schema for the daily statistics summary rollups is as follows:

  • Composite row key of the pump name and date
  • Column Family stats
  • Columns for min, max, avg.
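The sensor table must exist before the application writes to it. A rough sketch of creating it with the older HBase admin API (consistent with the pre-1.0 client calls used elsewhere in this post); the 30-day TTL is just an example value:

    import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
    import org.apache.hadoop.hbase.client.HBaseAdmin

    val conf = HBaseConfiguration.create()
    val admin = new HBaseAdmin(conf)
    val desc = new HTableDescriptor(TableName.valueOf("sensor"))
    // raw readings and alerts can be allowed to expire; stats are kept indefinitely
    val dataCF = new HColumnDescriptor("data")
    dataCF.setTimeToLive(30 * 24 * 60 * 60)
    val alertsCF = new HColumnDescriptor("alerts")
    alertsCF.setTimeToLive(30 * 24 * 60 * 60)
    desc.addFamily(dataCF)
    desc.addFamily(alertsCF)
    desc.addFamily(new HColumnDescriptor("stats"))
    if (!admin.tableExists("sensor")) admin.createTable(desc)
    admin.close()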

The function below converts a Sensor object into an HBase Put object, which is used to insert a row into HBase.

val cfDataBytes = Bytes.toBytes("data")

object Sensor {
. . .
  //  Convert a row of sensor object data to an HBase put object
  def convertToPut(sensor: Sensor): (ImmutableBytesWritable, Put) = {
      val dateTime = sensor.date + " " + sensor.time
      // create a composite row key: sensorid_date time
      val rowkey = sensor.resid + "_" + dateTime
      val put = new Put(Bytes.toBytes(rowkey))
      // add to column family data, column  data values to put object 
      put.add(cfDataBytes, Bytes.toBytes("hz"), Bytes.toBytes(sensor.hz))
      put.add(cfDataBytes, Bytes.toBytes("disp"), Bytes.toBytes(sensor.disp))
      put.add(cfDataBytes, Bytes.toBytes("flo"), Bytes.toBytes(sensor.flo))
      put.add(cfDataBytes, Bytes.toBytes("sedPPM"), Bytes.toBytes(sensor.sedPPM))
      put.add(cfDataBytes, Bytes.toBytes("psi"), Bytes.toBytes(sensor.psi))
      put.add(cfDataBytes, Bytes.toBytes("chlPPM"), Bytes.toBytes(sensor.chlPPM))
      return (new ImmutableBytesWritable(Bytes.toBytes(rowkey)), put)
  }
}

Configuration for Writing to an HBase Table

You can use the TableOutputFormat class with Spark to write to an HBase table, similar to how you would write to an HBase table from MapReduce. Below we set up the configuration for writing to HBase using the TableOutputFormat class.

   val tableName = "sensor"

   // set up the Hadoop HBase configuration using the (old-API) TableOutputFormat,
   // since saveAsHadoopDataset expects a JobConf
   val conf = HBaseConfiguration.create()
   conf.set(TableOutputFormat.OUTPUT_TABLE, tableName)
   val jobConfig: JobConf = new JobConf(conf, this.getClass)
   jobConfig.setOutputFormat(classOf[TableOutputFormat])
   jobConfig.set(TableOutputFormat.OUTPUT_TABLE, tableName)

The Spark Streaming Example Code

These are the basic steps for Spark Streaming code:

  1. Initialize a Spark StreamingContext object.
  2. Apply transformations and output operations to DStreams.
  3. Start receiving data and processing it using streamingContext.start().
  4. Wait for the processing to be stopped using streamingContext.awaitTermination().

We will go through each of these steps with the example application code.

Initializing the StreamingContext

First we create a StreamingContext, the main entry point for streaming functionality, with a 2-second batch interval.

val sparkConf = new SparkConf().setAppName("HBaseStream")

//  create a StreamingContext, the main entry point for all streaming functionality
val ssc = new StreamingContext(sparkConf, Seconds(2))

Next, we use the StreamingContext textFileStream(directory) method to create an input stream that monitors a Hadoop-compatible file system for new files and processes any files created in that directory.

// create a DStream that represents streaming data from a directory source
val linesDStream = ssc.textFileStream("/user/user01/stream")

The linesDStream represents the stream of data; each record is a line of text. Internally a DStream is a sequence of RDDs, one RDD per batch interval.

Apply transformations and output operations to DStreams

Next we parse the lines of data into Sensor objects, with the map operation on the linesDStream.

// parse each line of data in linesDStream  into sensor objects

val sensorDStream = linesDStream.map(Sensor.parseSensor) 

The map operation applies the Sensor.parseSensor function on the RDDs in the linesDStream, resulting in RDDs of Sensor objects.

Next we use the DStream foreachRDD method to apply processing to each RDD in this DStream. We filter the sensor objects for low psi to create alerts, then write the sensor and alert data to HBase by converting them to Put objects and using the PairRDDFunctions saveAsHadoopDataset method, which outputs the RDD to any Hadoop-supported storage system using a Hadoop Configuration object for that storage system (see the configuration for writing to an HBase table above).

// performs this function on each RDD in the DStream
sensorDStream.foreachRDD { rdd =>
      // filter sensor data for low psi
      val alertRDD = rdd.filter(sensor => sensor.psi < 5.0)

      // convert sensor data to Put objects and write to the HBase table, column family data
      rdd.map(Sensor.convertToPut).saveAsHadoopDataset(jobConfig)

      // convert alerts to Put objects and write to the HBase table, column family alerts
      alertRDD.map(Sensor.convertToPutAlert).saveAsHadoopDataset(jobConfig)
}

The Sensor objects in each RDD are converted to Put objects and then written to HBase.
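convertToPutAlert is referenced above but not shown; it presumably mirrors convertToPut, targeting the alerts column family described earlier. A minimal sketch (the cfAlertBytes constant and the psi column name are assumptions):

    val cfAlertBytes = Bytes.toBytes("alerts")

    object Sensor {
    . . .
      // sketch: write a low-psi reading into the alerts column family under the same composite row key
      def convertToPutAlert(sensor: Sensor): (ImmutableBytesWritable, Put) = {
          val rowkey = sensor.resid + "_" + sensor.date + " " + sensor.time
          val put = new Put(Bytes.toBytes(rowkey))
          put.add(cfAlertBytes, Bytes.toBytes("psi"), Bytes.toBytes(sensor.psi))
          (new ImmutableBytesWritable(Bytes.toBytes(rowkey)), put)
      }
    }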

Start receiving data

To start receiving data, we must explicitly call start() on the StreamingContext, then call awaitTermination to wait for the streaming computation to finish.

    // Start the computation
    ssc.start()
    // Wait for the computation to terminate
    ssc.awaitTermination()

Spark Reading from and Writing to HBase

Now we want to read the HBase sensor table data, calculate daily summary statistics, and write these statistics to the stats column family.

The code below reads the psi column from the sensor table’s data column family, calculates statistics on this data using StatCounter, then writes the statistics to the sensor table’s stats column family.

     // configure HBase for reading 
    val conf = HBaseConfiguration.create()
    conf.set(TableInputFormat.INPUT_TABLE, HBaseSensorStream.tableName)
    // scan data column family psi column
    conf.set(TableInputFormat.SCAN_COLUMNS, "data:psi") 

// Load an RDD of (row key, row Result) tuples from the table
    val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
      classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
      classOf[org.apache.hadoop.hbase.client.Result])

    // transform (row key, row Result) tuples into an RDD of Results
    val resultRDD = hBaseRDD.map(tuple => tuple._2)

    // transform into an RDD of (RowKey, ColumnValue)s , with Time removed from row key
    val keyValueRDD = resultRDD.
              map(result => (Bytes.toString(result.getRow()).
              split(" ")(0), Bytes.toDouble(result.value)))

    // group by rowkey , get statistics for column value
    val keyStatsRDD = keyValueRDD.
             groupByKey().
             mapValues(list => StatCounter(list))

    // convert rowkey, stats to put and write to hbase table stats column family
    keyStatsRDD.map { case (k, v) => convertToPut(k, v) }.saveAsHadoopDataset(jobConfig)

The output from newAPIHadoopRDD is an RDD of (row key, Result) pairs. The PairRDDFunctions saveAsHadoopDataset method saves the Put objects to HBase.
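The convertToPut overload that takes a (rowkey, StatCounter) pair is also not shown in the excerpt; a minimal sketch of what it might look like (the cfStatsBytes constant and the column names are assumptions):

    val cfStatsBytes = Bytes.toBytes("stats")

    // sketch: write min/max/mean of the psi column for one pump/day into the stats column family
    def convertToPut(key: String, stats: StatCounter): (ImmutableBytesWritable, Put) = {
        val put = new Put(Bytes.toBytes(key))
        put.add(cfStatsBytes, Bytes.toBytes("psimin"), Bytes.toBytes(stats.min))
        put.add(cfStatsBytes, Bytes.toBytes("psimax"), Bytes.toBytes(stats.max))
        put.add(cfStatsBytes, Bytes.toBytes("psimean"), Bytes.toBytes(stats.mean))
        (new ImmutableBytesWritable(Bytes.toBytes(key)), put)
    }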

Software

This example targets the MapR Sandbox with Spark 1.3.1, as reflected in the spark-submit paths below.

Running the Application

You can run the code as a standalone application as described in the tutorial on Getting Started with Spark on MapR Sandbox.

Here are the steps summarized:

  1. Log into the MapR Sandbox, as explained in Getting Started with Spark on MapR Sandbox, using userid user01, password mapr.
  2. Build the application using maven.
  3. Copy the jar file and data file to your sandbox home directory /user/user01 using scp.
  4. Run the streaming app:
     /opt/mapr/spark/spark-1.3.1/bin/spark-submit --driver-class-path `hbase classpath` 
       --class examples.HBaseSensorStream sparkstreamhbaseapp-1.0.jar
    
  5. Copy the streaming data file to the stream directory:
    cp sensordata.csv /user/user01/stream/
  6. Read data and calculate stats for one column:
       /opt/mapr/spark/spark-1.3.1/bin/spark-submit --driver-class-path `hbase classpath` 
        --class examples.HBaseReadWrite sparkstreamhbaseapp-1.0.jar
    
  7. Calculate stats for the whole row:
      /opt/mapr/spark/spark-1.3.1/bin/spark-submit --driver-class-path `hbase classpath` 
       --class examples.HBaseReadRowWriteStats sparkstreamhbaseapp-1.0.jar
Categories: #oracle_Emp, big data