NameNode, Secondary NameNode and Safe Mode – Hadoop

Hadoop Architecture is similar to the master-slave architecture. The master node is called the namenode and the slave nodes are called datanodes. The datanodes are also called as worker nodes. The namenode is the heart of the hadoop system and it manages the filesystem namespace. The namenode stores the directory, files and file to block mapping metadata on the local disk. The namenode stores this metadata in two files, the namespace image and the edit log.

Safe Mode

When you installed the hadoop, the image and edit log files will be empty. As soon as the clients starts writing the data, the namenode writes the file related metadata to the edit log and constructs the file to block and block to datanode mapping in the memory. The namenode does not write the block to data node mapping persistently on the local disk. It just maintains this mapping in the memory. When you restart the namenode, then the namenode goes into a safemode. In this safemode, the namenode merges all the edit log files and writes the metadata into the image file. The namenode then clears the edit log files and waits for the datanodes to reports the blocks. As soon as the datanodes starts reporting the blocks, the namenode constructs the block to datanode mapping in the memory. Once the datanodes completes reporting or cross a certain threshold, the namenode comes out of the safe mode. In safe mode the client cannot do any operations.

As the namenode writes the metadata into the memory, the number of files that you can store in hadoop is controlled by the namenode memory size. As a thumb rule, if the metadata of file takes 150 bytes, then to store one million files you need at least 300MB of ram on the namenode.

Without the namenode, you cannot access the hadoop system. If the namenode goes down then the complete hadoop system is inaccessible. So it is always better to take backup of the image and edit log files. Hadoop provides two ways for taking backup of the image and edit log files. One way is to use the secondary namenode and the second way is to write the files to a remote system.

Secondary NameNode

The secondary Namenode job is to periodically merge the image file with edit log file. This merge operation is to avoid the edit log file from growing into a large file. The secondary namenode runs on a separate machine and copies the image and edit log files periodically. The secondary namenode requires as much memory as the primary namenode. If the namenode crashes, then you can use the copied image and edit log files from secondary namenode and bring the primary namenode up. However, the state of secondary namenode lags from the primary namenode. So in case of namenode failure, the data loss is obvious.

Namenode and secondary namenode

Note: You cannot make the secondary namenode as the primary namenode.

Remote File System

You can configure the namenode to write the image and edit log files to multiple remote network file systems. These write operations are synchronous and atomic. The namenode first writes the metadata to the edit log on the local disk, then writes to the remote NFS system and then to the memory. Once all these operations are success, then the namenode commits the state. In case of any failure, you can copy the files from the remote NFS system and start the namenode. In this case, there is no loss of data.


Big Data Hadoop Architecture and Components

Hadoop is an apache open source software (java framework) which runs on a cluster of commodity machines. Hadoop provides both distributed storage and distributed processing of very large data sets. Hadoop is capable of processing big data of sizes ranging from Gigabytes to Petabytes.

Hadoop architecture is similar to master/slave architecture. The architecture of hadoop is shown in the below diagram:

Hadoop MRV1 Architecture:


Hadoop Architecture

Hadoop Architecture Overview:

Hadoop is a master/ slave architecture. The master being the namenode and slaves are datanodes. The namenode controls the access to the data by clients. The datanodes manage the storage of data on the nodes that are running on. Hadoop splits the file into one or more blocks and these blocks are stored in the datanodes. Each data block is replicated to 3 different datanodes to provide high availability of the hadoop system. The block replication factor is configurable.

Hadoop Components:

The major components of hadoop are:

  • Hadoop Distributed File System: HDFS is designed to run on commodity machines which are of low cost hardware. The distributed data is stored in the HDFS file system. HDFS is highly fault tolerant and provides high throughput access to the applications that require big data.
  • Namenode: Namenode is the heart of the hadoop system. The namenode manages the file system namespace. It stores the metadata information of the data blocks. This metadata is stored permanently on to local disk in the form of namespace image and edit log file. The namenode also knows the location of the data blocks on the data node. However the namenode does not store this information persistently. The namenode creates the block to datanode mapping when it is restarted. If the namenode crashes, then the entire hadoop system goes down.
  • Secondary Namenode: The responsibility of secondary name node is to periodically copy and merge the namespace image and edit log. In case if the name node crashes, then the namespace image stored in secondary namenode can be used to restart the namenode.
  • DataNode: It stores the blocks of data and retrieves them. The datanodes also reports the blocks information to the namenode periodically.
  • JobTracker: JobTracker responsibility is to schedule the clients jobs. Job tracker creates map and reduce tasks and schedules them to run on the datanodes (tasktrackers). Job Tracker also checks for any failed tasks and reschedules the failed tasks on another datanode. Jobtracker can be run on the namenode or a separate node.
  • TaskTracker: Tasktracker runs on the datanodes. Task trackers responsibility is to run the the map or reduce tasks assigned by the namenode and to report the status of the tasks to the namenode.

Hadoop fs Shell Commands Examples – Baby Steps

Hadoop file system (fs) shell commands are used to perform various file operations like copying file, changing permissions, viewing the contents of the file, changing ownership of files, creating directories etc.

The syntax of fs shell command is

hadoop fs <args>

All the fs shell commands takes the path URI as arguments. The format of URI is sheme://authority/path. The scheme and authority are optional. For hadoop the scheme is hdfs and for local file system the scheme is file. IF you do not specify a scheme, the default scheme is taken from the configuration file. You can also specify the directories in hdfs along with the URI as hdfs://namenodehost/dir1/dir2 or simple /dir1/dir2.

The hadoop fs commands are almost similar to the unix commands. Let see each of the fs shell commands in detail with examples:

Hadoop fs Shell Commands

hadoop fs ls:

The hadoop ls command is used to list out the directories and files. An example is shown below:

> hadoop fs -ls /user/hadoop/employees
Found 1 items
-rw-r--r--   2 hadoop hadoop 2 2012-06-28 23:37 /user/hadoop/employees/000000_0

The above command lists out the files in the employees directory.

> hadoop fs -ls /user/hadoop/dir
Found 1 items
drwxr-xr-x   - hadoop hadoop  0 2013-09-10 09:47 /user/hadoop/dir/products

The output of hadoop fs ls command is almost similar to the unix ls command. The only difference is in the second field. For a file, the second field indicates the number of replicas and for a directory, the second field is empty.

hadoop fs lsr:

The hadoop lsr command recursively displays the directories, sub directories and files in the specified directory. The usage example is shown below:

> hadoop fs -lsr /user/hadoop/dir
Found 2 items
drwxr-xr-x   - hadoop hadoop  0 2013-09-10 09:47 /user/hadoop/dir/products
-rw-r--r--   2 hadoop hadoop    1971684 2013-09-10 09:47 /user/hadoop/dir/products/products.dat

The hadoop fs lsr command is similar to the ls -R command in unix.

hadoop fs cat:

Hadoop cat command is used to print the contents of the file on the terminal (stdout). The usage example of hadoop cat command is shown below:

> hadoop fs -cat /user/hadoop/dir/products/products.dat

cloudera book by amazon
cloudera tutorial by ebay

hadoop fs chgrp:

hadoop chgrp shell command is used to change the group association of files. Optionally you can use the -R option to change recursively through the directory structure. The usage of hadoop fs -chgrp is shown below:

hadoop fs -chgrp [-R] <NewGroupName> <file or directory name>

hadoop fs chmod:

The hadoop chmod command is used to change the permissions of files. The -R option can be used to recursively change the permissions of a directory structure. The usage is shown below:

hadoop fs -chmod [-R] <mode | octal mode> <file or directory name>

hadoop fs chown:

The hadoop chown command is used to change the ownership of files. The -R option can be used to recursively change the owner of a directory structure. The usage is shown below:

hadoop fs -chown [-R] <NewOwnerName>[:NewGroupName] <file or directory name>

hadoop fs mkdir:

The hadoop mkdir command is for creating directories in the hdfs. You can use the -p option for creating parent directories. This is similar to the unix mkdir command. The usage example is shown below:

> hadoop fs -mkdir /user/hadoop/hadoopdemo

The above command creates the hadoopdemo directory in the /user/hadoop directory.

> hadoop fs -mkdir -p /user/hadoop/dir1/dir2/demo

The above command creates the dir1/dir2/demo directory in /user/hadoop directory.

hadoop fs copyFromLocal:

The hadoop copyFromLocal command is used to copy a file from the local file system to the hadoop hdfs. The syntax and usage example are shown below:

hadoop fs -copyFromLocal <localsrc> URI


Check the data in local file
> ls sales
2001, htc

Now copy this file to hdfs

> hadoop fs -copyFromLocal sales /user/hadoop/hadoopdemo

View the contents of the hdfs file.

> hadoop fs -cat /user/hadoop/hadoopdemo/sales
2001, htc

hadoop fs copyToLocal:

The hadoop copyToLocal command is used to copy a file from the hdfs to the local file system. The syntax and usage example is shown below:

hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>


hadoop fs -copyToLocal /user/hadoop/hadoopdemo/sales salesdemo

The -ignorecrc option is used to copy the files that fail the crc check. The -crc option is for copying the files along with their CRC.

hadoop fs cp:

The hadoop cp command is for copying the source into the target. The cp command can also be used to copy multiple files into the target. In this case the target should be a directory. The syntax is shown below:

hadoop fs -cp /user/hadoop/SrcFile /user/hadoop/TgtFile
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 hdfs://namenodehost/user/hadoop/TgtDirectory

hadoop fs -put:

Hadoop put command is used to copy multiple sources to the destination system. The put command can also read the input from the stdin. The different syntaxes for the put command are shown below:

Syntax1: copy single file to hdfs

hadoop fs -put localfile /user/hadoop/hadoopdemo

Syntax2: copy multiple files to hdfs

hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdemo

Syntax3: Read input file name from stdin
hadoop fs -put - hdfs://namenodehost/user/hadoop/hadoopdemo

hadoop fs get:

Hadoop get command copies the files from hdfs to the local file system. The syntax of the get command is shown below:

hadoop fs -get /user/hadoop/hadoopdemo/hdfsFileName localFileName

hadoop fs getmerge:

hadoop getmerge command concatenates the files in the source directory into the destination file. The syntax of the getmerge shell command is shown below:

hadoop fs -getmerge <src> <localdst> [addnl]

The addnl option is for adding new line character at the end of each file.

hadoop fs moveFromLocal:

The hadoop moveFromLocal command moves a file from local file system to the hdfs directory. It removes the original source file. The usage example is shown below:

> hadoop fs -moveFromLocal products /user/hadoop/hadoopdemo

hadoop fs mv:

It moves the files from source hdfs to destination hdfs. Hadoop mv command can also be used to move multiple source files into the target directory. In this case the target should be a directory. The syntax is shown below:

hadoop fs -mv /user/hadoop/SrcFile /user/hadoop/TgtFile
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2 hdfs://namenodehost/user/hadoop/TgtDirectory

hadoop fs du:

The du command displays aggregate length of files contained in the directory or the length of a file in case its just a file. The syntax and usage is shown below:

hadoop fs -du hdfs://namenodehost/user/hadoop

hadoop fs dus:

The hadoop dus command prints the summary of file lengths

> hadoop fs -dus hdfs://namenodehost/user/hadoop
hdfs://namenodehost/user/hadoop 21792568333

hadoop fs expunge:

Used to empty the trash. The usage of expunge is shown below:

hadoop fs -expunge

hadoop fs rm:

Removes the specified list of files and empty directories. An example is shown below:

hadoop fs -rm /user/hadoop/file

hadoop fs -rmr:

Recursively deletes the files and sub directories. The usage of rmr is shown below:

hadoop fs -rmr /user/hadoop/dir

hadoop fs setrep:

Hadoop setrep is used to change the replication factor of a file. Use the -R option for recursively changing the replication factor.

hadoop fs -setrep -w 4 -R /user/hadoop/dir

hadoop fs stat:

Hadoop stat returns the stats information on a path. The syntax of stat is shown below:

hadoop fs -stat URI

> hadoop fs -stat /user/hadoop/
2013-09-24 07:53:04

hadoop fs tail:

Hadoop tail command prints the last kilobytes of the file. The -f option can be used same as in unix.

> hafoop fs -tail /user/hadoop/sales.dat

12345 abc
2456 xyz

hadoop fs test:

The hadoop test is used for file test operations. The syntax is shown below:

hadoop fs -test -[ezd] URI

Here “e” for checking the existence of a file, “z” for checking the file is zero length or not, “d” for checking the path is a directory or no. On success, the test command returns 1 else 0.

hadoop fs text:

The hadoop text command displays the source file in text format. The allowed source file formats are zip and TextRecordInputStream. The syntax is shown below:

hadoop fs -text <src>

hadoop fs touchz:

The hadoop touchz command creates a zero byte file. This is similar to the touch command in unix. The syntax is shown below:

hadoop fs -touchz /user/hadoop/filename

Exploring Apache Spark on the New BigDataLite 3.0 VM “”

The latest version of Oracle’s BigDataLite VirtualBox VM went up on OTN last week, and amongst other things it includes the latest CDH5.0 version of Cloudera Distribution including Hadoop, as featured on Oracle Big Data Appliance. This new version comes with an update to MapReduce, moving it to MapReduce 2.0 (with 1.0 still there for backwards-compatibility) and with YARN as the replacement for the Hadoop JobTracker. If you’re developing on CDH or the BigDataLite VM you shouldn’t notice any differences with the move to YARN, but it’s a more forward-looking, modular way of tracking and allocating resources on these types of compute clusters that also opens them up to processing models other than MapReduce.

The other new Hadoop feature that you’ll notice with BigDataLite VM and CDH5 is an updated version of Hue, the web-based development environment you use for creating Hive queries, uploading files and so on. As the version of Hue shipped is now Hue 3.5, there’s proper support for quoted CSV files in the Hue / Hive uploader (hooray) along with support for stripping the first, title, line, and an updated Pig editor that prompts you for command syntax (like the Hortonworks Pig editor).


This new version of BigDataLite also seems to have had Cloudera Manager removed (or at least, it’s not available as usual at http://bigdatalite:7180), with instead a utility provided on the desktop that allows you to stop and start the various VM services, including the Oracle Database and Oracle NoSQL database that also come with the VM. Strictly-speaking its actually easier to use than Cloudera Manager, but it’s a shame it’s gone as there’s lots of monitoring and configuration tools in the product that I’ve found useful in the past.


CDH5 also comes with Apache Spark, a cluster processing framework that’s being positioned as the long-term replacement for MapReduce. Spark is technically a programming model that allows developers to create scripts, or programs, that bring together operators such as filters, aggregators, joiners and group-bys using languages such as Scala and Python, but crucially this can all happen in-memory – making Spark potentially much faster than MapReduce for doing both batch, and ad-hoc analysis.

This article on the Cloudera blog goes into more detail on what Apache Spark is and how it improves over MapReduce, and this article takes a look at the Spark architecture and how it’s approach avoids the multi-stage execution model that MapReduce uses (something you’ll see if you ever do a join in Pig or Hive). But what does some basic Spark code look like, using the default Scala language most people associate Spark with? Let’s take a look at some sample code, using the same Rittman Mead Webserver log files I used in the previous Pig and Hive/Impala articles.

You can start up Spark in interactive mode, like you do with Pig and Grunt, by opening a Terminal session and typing in “spark-shell”:

Spark has the concept of RDDs, “Resilient Distributed Datasets” that can be thought of as similar to relations in Pig, and tables in Hive, but which crucially can be cached in RAM for improved performance when you need to access their dataset repeatedly. Like Pig, Spark features “lazy execution”, only processing the various Spark commands when you actually need to (for example, when outputting the results of a data-flow”, so let’s run two more commands to load in one of the log files on HDFS, and then count the log file lines within it.

So the file contains 154563 records. Running the logfile.count() command again, though, brings back the count immediately as the RDD has been cached; we can explicitly cache RDDs directly in these commands if we like, by using the “.cache” method:

So let’s try some filtering, retrieving just those log entries where the user is requesting our BI Apps 11g homepage (“/biapps11g/“):

Or I can create a dataset containing just those records that have a 404 error code:

Spark, using Scala as the language, has routines for filtering, joining, splitting and otherwise transforming data, but something that’s quite common in Spark is to create Java JAR files, typically from compiled Scala code, to encapsulate certain common data transformations, such as this Apache CombinedFormat Log File parser available on Github from @alvinalexander, author of the Scala Cookbook. Once you’ve compiled this into a JAR file and added it to your SPARK_CLASSPATH (see his blog post for full details, and from where the Spark examples below were taken from), you can start to work with the individual log file elements, like we did in the Hive and Pig examples where we parsed the log file using Regexes.

Then I can access the HTTP Status Code using its own property, like this:

Then we can use a similar method to retrieve all of the “request” entries in the log file where the user got a 404 error, starting off by defining two methods that will help with the request parsing – note the use of the :paste command which allows you to block-paste a set of commands into the scala-shell:

Now we can run the code to output the URIs that have been generating 404 errors:

So Spark is commonly-considered the successor to MapReduce, and you can start playing around with it on the new BigDataLite VM. Unless you’re a Java (or Scala, or Python) programmer, Spark isn’t quite as easy as Pig or Hive to get into, but the potential benefits over MapReduce are impressive and it’d be worth taking a look. Hopefully we’ll have more on Spark on the blog over the next few months, as we get to grips with it properly.

Relational to JSON with Node.js(

Node.js is a platform for building JavaScript based applications that has become very popular over the last few years. To best support this rapidly growing community, Oracle developed a new driver for Node.js which was released in January of this year as a preview release. Written in C and utilizing the Oracle Instant Client libraries, the Node.js driver is both performant and feature rich.

When it comes to generating JSON, Node.js seems like a natural choice as JSON is based on JavaScript objects. However, it’s not exactly fair to compare a solution crafted in Node.js to the PL/SQL solutions as Node.js has a clear disadvantage: it runs on a separate server. That means we have to bring the data across the network before we can transform it to the desired output. Whatever, let’s compare them anyway! 😀

Please Note: This post is part of a series on generating JSON from relational data in Oracle Database. See that post for details on the solutions implemented below as well as other options that can be used to achieve that goal.

Solution 1 – Emulating the PL/SQL solutions

This first solution follows the same flow as the PL/SQL based solutions. We start off by making a connection to the database and then begin creating the object we need by fetching data from the departments table. Next we go to the locations table, then the regions table, and on and on until finally we have the object we want. We then use JSON.stringify to serialize the object into the JSON result we are after.


When called with the department id set to 10, the function returns the serialized JSON that matches the goal 100%.

So we got the output we were after, but what about performance? Let’s explore the test results…

Test results

When I finished the solutions for PL/JSON and APEX_JSON, I wanted to see if one was faster than the other. I ran a very simple test: I generated the JSON for all 27 departments in the HR schema 100 times in a loop (2,700 invocations of the solution). APEX_JSON finished in around 3.5 seconds while PL/JSON took 17 seconds. How long did it take the Node.js solution from above to do this? 136 seconds! What??? Node.js must be slow, right? Well, not exactly…

You may have noticed that, on line 5, I used the base driver class to get a connection to the database. If I was just doing this once the code would be fine as is, but 2,700 times? That’s not good. In fact, that’s really bad! But what’s a Node.js developer to do?

Using a connection pool

The Node.js driver has supported connection pools from the beginning. The idea is that, rather than incur the cost of making a connection to the database each time we need one, we’ll just grab a connection from a pool of connections that have already been established and release it back to the pool when we’re done with it. Best of all, this is really simple!

Here’s a new module that I’ll use to create, store, and fetch the connection pool:

With that in place, I can update the test code to open the connection pool prior to starting the test. Then I just need to update the solution to use the connection pool:

That’s it! I just swapped out the driver module for the pool module and used it to get a connection instead. How much did using the connection pool help? With that one simple change the test completed in 21.5 seconds! Wow, that’s just a little longer than it took PL/JSON. So Node.js is still slow, right? Well, not exactly… 😉

Solution 2 – Optimizing for Node.js

Remember when I said that the solution mimicked the PL/SQL code? There are two major problems with doing this in Node.js:

  1. Excessive round trips: Each query we’re executing is a round trip to the database. In PL/SQL, this is just a context switch from the PL/SQL to the SQL engine and back. I’m not saying we shouldn’t minimize context switches in PL/SQL, we should, but in Node.js this is a round trip across the network! By using the database to do some joins we can reduce the number of queries from 7 to 3.
  2. Sequential execution: The way the PL/SQL solutions were written, a query was executed, the results were processed, and then the next query was executed. This sequence was repeated until all work was complete. It would have been nice to be able to do some of this work in parallel. While Oracle does have some options for doing work in parallel, such as Parallel Execution and DBMS_PARALLEL_EXECUTE, PL/SQL is for the most part a single threaded environment. I could probably have worked some magic to execute the queries and process the results in parallel, perhaps via scheduled jobs or Advanced Queues, but it would have been difficult using PL/SQL.

    However, this is an area where Node.js shines! Doing work in parallel is quite easy with the the async module. In the following solution, I use async’s parallel method to fire off three functions at the same time: the first builds the basic department object, the second builds the employees array, and the last builds the jobHistory array. The final function, which is fired when all the others have completed, puts the results of the first three functions together into a single object before returning the JSON.

Here’s the solution optimized for Node.js:

How fast did that solution finish the test? 7.8 seconds! Not too shabby! That’s faster than the PL/JSON solution but not quite as fast as APEX_JSON. But could we optimize even further?

Client Result Caching

Because I was running my tests on Oracle XE I wasn’t able to do a final optimization that would have been possible with the Enterprise Edition of the database: Client Result Caching. With Client Result Caching, Oracle can automatically maintain a cache of the data on the server where Node.js is running. This could have eliminated some round trips and data having to move across the network. I’ll revisit this feature in a future post were we can explore it in detail.