Change Exadata flashcache mode from WriteThrough to WriteBack

We can enable WriteBack mode for the flash cache without shutting down ASM or any database instance on the Exadata. The change is made in a rolling fashion, one cell at a time. Make sure you finish one cell before starting the operation on the next cell.

I will follow the document below to change the flashcache mode on an Exadata Database Machine X3-2 Eighth Rack system. The same document also explains why you would want WriteBack mode, along with other useful information.

Exadata Write-Back Flash Cache – FAQ (Doc ID 1500257.1)

4. How to determine if you have write back flash cache enabled?

[root@testdbadm01 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"
testceladm01: WriteThrough
testceladm02: WriteThrough
testceladm03: WriteThrough
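With more than a handful of cells it is easy to overlook one line, so the same check can be scripted. This is a minimal sketch that only parses the dcli output format shown above; the function name all_cells_in_mode is my own and not part of dcli or CellCLI.

```shell
# all_cells_in_mode MODE
# Reads "cell: mode" lines (the dcli output shown above) on stdin.
# Succeeds only if at least one line was read and every cell reports MODE.
# The comparison is case-insensitive because this post shows "WriteThrough"
# before the change and "writeback" after it.
all_cells_in_mode() {
    awk -v want="$1" '
        NF {
            n++
            if (tolower($2) != tolower(want)) {
                cell = $1
                sub(/:$/, "", cell)          # drop the trailing colon dcli adds
                print cell " is still " $2
                bad = 1
            }
        }
        END { exit (bad || !n) ? 1 : 0 }
    '
}

# Usage (assumes the same ~/cell_group file as above):
#   dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode" \
#       | all_cells_in_mode writeback && echo "all cells are in writeback mode"
```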

5. How can we enable the write back flash cache?

Before proceeding any further, make sure all the griddisks are online and there are no problems on the cells.

[root@testdbadm01 ~]# dcli -g cell_group -l root cellcli -e list griddisk attributes asmdeactivationoutcome, asmmodestatus
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm01: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm02: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
testceladm03: Yes        ONLINE
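Reading 48 lines by eye is error-prone, so the output can be condensed to one line per cell. This is a minimal sketch that assumes the two-column "Yes ONLINE" format shown above; griddisk_summary is a hypothetical helper name.

```shell
# griddisk_summary
# Reads "cell: asmdeactivationoutcome asmmodestatus" lines on stdin and
# prints one summary line per cell. Exits non-zero if any grid disk is not
# "Yes ONLINE" or if no input was read.
griddisk_summary() {
    awk '
        NF {
            n++
            total[$1]++
            if ($2 != "Yes" || $3 != "ONLINE") { notready[$1]++; bad = 1 }
        }
        END {
            for (cell in total)
                printf "%s %d grid disks, %d not ready\n", cell, total[cell], notready[cell]
            exit (bad || !n) ? 1 : 0
        }
    '
}

# Usage:
#   dcli -g cell_group -l root cellcli -e "list griddisk attributes asmdeactivationoutcome, asmmodestatus" \
#       | griddisk_summary
```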

On an Eighth Rack system the flashcache is reduced to half size, which is why the size appears as 744 GB.

[root@testdbadm01 ~]#  dcli -g cell_group -l root cellcli -e list flashcache detail
testceladm01: name:                      testceladm01_FLASHCACHE
testceladm01: cellDisk:                  FD_03_testceladm01,FD_02_testceladm01,FD_00_testceladm01,FD_01_testceladm01,FD_06_testceladm01,FD_05_testceladm01,FD_04_testceladm01,FD_07_testceladm01
testceladm01: creationTime:              2013-07-29T17:55:28+03:00
testceladm01: degradedCelldisks:
testceladm01: effectiveCacheSize:        744.125G
testceladm01: id:                        5adas74d-asdc-4477-382d-30c14052c23d
testceladm01: size:                      744.125G
testceladm01: status:                    normal
testceladm02: name:                      testceladm02_FLASHCACHE
testceladm02: cellDisk:                  FD_03_testceladm02,FD_07_testceladm02,FD_02_testceladm02,FD_00_testceladm02,FD_06_testceladm02,FD_01_testceladm02,FD_04_testceladm02,FD_05_testceladm02
testceladm02: creationTime:              2013-07-29T17:55:40+03:00
testceladm02: degradedCelldisks:
testceladm02: effectiveCacheSize:        744.125G
testceladm02: id:                        80c140a3-0c14-40ba-8364-10c1460d802f
testceladm02: size:                      744.125G
testceladm02: status:                    normal
testceladm03: name:                      testceladm03_FLASHCACHE
testceladm03: cellDisk:                  FD_06_testceladm03,FD_03_testceladm03,FD_00_testceladm03,FD_04_testceladm03,FD_05_testceladm03,FD_02_testceladm03,FD_01_testceladm03,FD_07_testceladm03
testceladm03: creationTime:              2013-07-29T17:55:25+03:00
testceladm03: degradedCelldisks:
testceladm03: effectiveCacheSize:        744.125G
testceladm03: id:                        6950c148-992b-4347-a1e1-e60c1406bda6
testceladm03: size:                      744.125G
testceladm03: status:                    normal

Repeat the following procedure for every cell. We have 3 cells on the Eighth Rack Exadata. Operate on only one cell at a time, and make sure the cell is fully operational before proceeding to the next one. I will follow the Support document, but I prefer to run the commands from the CellCLI prompt.

Also I prefer to do any disk operation during off peak hours.
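To keep the order straight it helps to have the per-cell commands as one checklist. The sketch below only prints the CellCLI commands used in steps 1 to 10 that follow; it does not run anything, and the function name writeback_steps is my own.

```shell
# writeback_steps
# Prints, in order, the CellCLI commands from steps 1-10 of this post.
# Nothing is executed; this is a checklist to review before touching a cell.
writeback_steps() {
    cat <<'EOF'
drop flashcache
list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
alter griddisk all inactive
alter cell shutdown services cellsrv
alter cell flashCacheMode=writeback
alter cell startup services cellsrv
alter griddisk all active
list griddisk attributes name,asmmodestatus
create flashcache all
list cell attributes flashCacheMode
EOF
}

# Usage: number the steps for a printed runbook
#   writeback_steps | nl
```

I deliberately do not pipe this into cellcli: steps 2 and 8 need a human to read the output before continuing.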

1. Drop the flash cache on that cell

CellCLI>  drop flashcache
Flash cache testceladm01_FLASHCACHE successfully dropped

2. Check if ASM will be OK if the grid disks go OFFLINE. The following command should return ‘Yes’ for the grid disks being listed:

CellCLI> list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
         DATA_TEST_CD_00_testceladm01    ONLINE  Yes
         DATA_TEST_CD_01_testceladm01    ONLINE  Yes
         DATA_TEST_CD_02_testceladm01    ONLINE  Yes
         DATA_TEST_CD_03_testceladm01    ONLINE  Yes
         DATA_TEST_CD_04_testceladm01    ONLINE  Yes
         DATA_TEST_CD_05_testceladm01    ONLINE  Yes
         DBFS_DG_CD_02_testceladm01      ONLINE  Yes
         DBFS_DG_CD_03_testceladm01      ONLINE  Yes
         DBFS_DG_CD_04_testceladm01      ONLINE  Yes
         DBFS_DG_CD_05_testceladm01      ONLINE  Yes
         RECO_TEST_CD_00_testceladm01    ONLINE  Yes
         RECO_TEST_CD_01_testceladm01    ONLINE  Yes
         RECO_TEST_CD_02_testceladm01    ONLINE  Yes
         RECO_TEST_CD_03_testceladm01    ONLINE  Yes
         RECO_TEST_CD_04_testceladm01    ONLINE  Yes
         RECO_TEST_CD_05_testceladm01    ONLINE  Yes

3. Inactivate the grid disks on the cell

CellCLI> alter griddisk all inactive
GridDisk DATA_TEST_CD_00_testceladm01 successfully altered
GridDisk DATA_TEST_CD_01_testceladm01 successfully altered
GridDisk DATA_TEST_CD_02_testceladm01 successfully altered
GridDisk DATA_TEST_CD_03_testceladm01 successfully altered
GridDisk DATA_TEST_CD_04_testceladm01 successfully altered
GridDisk DATA_TEST_CD_05_testceladm01 successfully altered
GridDisk DBFS_DG_CD_02_testceladm01 successfully altered
GridDisk DBFS_DG_CD_03_testceladm01 successfully altered
GridDisk DBFS_DG_CD_04_testceladm01 successfully altered
GridDisk DBFS_DG_CD_05_testceladm01 successfully altered
GridDisk RECO_TEST_CD_00_testceladm01 successfully altered
GridDisk RECO_TEST_CD_01_testceladm01 successfully altered
GridDisk RECO_TEST_CD_02_testceladm01 successfully altered
GridDisk RECO_TEST_CD_03_testceladm01 successfully altered
GridDisk RECO_TEST_CD_04_testceladm01 successfully altered
GridDisk RECO_TEST_CD_05_testceladm01 successfully altered

4. Shut down cellsrv service

CellCLI> alter cell shutdown services cellsrv 

Stopping CELLSRV services... 
The SHUTDOWN of CELLSRV services was successful.

5. Set the cell flashcache mode to writeback

CellCLI> alter cell flashCacheMode=writeback
Cell testceladm01 successfully altered

6. Restart the cellsrv service

CellCLI> alter cell startup services cellsrv

Starting CELLSRV services...
The STARTUP of CELLSRV services was successful.

7. Reactivate the griddisks on the cell

CellCLI> alter griddisk all active
GridDisk DATA_TEST_CD_00_testceladm01 successfully altered
GridDisk DATA_TEST_CD_01_testceladm01 successfully altered
GridDisk DATA_TEST_CD_02_testceladm01 successfully altered
GridDisk DATA_TEST_CD_03_testceladm01 successfully altered
GridDisk DATA_TEST_CD_04_testceladm01 successfully altered
GridDisk DATA_TEST_CD_05_testceladm01 successfully altered
GridDisk DBFS_DG_CD_02_testceladm01 successfully altered
GridDisk DBFS_DG_CD_03_testceladm01 successfully altered
GridDisk DBFS_DG_CD_04_testceladm01 successfully altered
GridDisk DBFS_DG_CD_05_testceladm01 successfully altered
GridDisk RECO_TEST_CD_00_testceladm01 successfully altered
GridDisk RECO_TEST_CD_01_testceladm01 successfully altered
GridDisk RECO_TEST_CD_02_testceladm01 successfully altered
GridDisk RECO_TEST_CD_03_testceladm01 successfully altered
GridDisk RECO_TEST_CD_04_testceladm01 successfully altered
GridDisk RECO_TEST_CD_05_testceladm01 successfully altered

8. Verify all grid disks have been successfully put online using the following command:

(At this point the DATA_TEST disk group has started synchronization.)

CellCLI> list griddisk attributes name, asmmodestatus
DATA_TEST_CD_00_testceladm01 SYNCING
DATA_TEST_CD_01_testceladm01 SYNCING
DATA_TEST_CD_02_testceladm01 SYNCING
DATA_TEST_CD_03_testceladm01 SYNCING
DATA_TEST_CD_04_testceladm01 SYNCING
DATA_TEST_CD_05_testceladm01 SYNCING
DBFS_DG_CD_02_testceladm01 OFFLINE
DBFS_DG_CD_03_testceladm01 OFFLINE
DBFS_DG_CD_04_testceladm01 OFFLINE
DBFS_DG_CD_05_testceladm01 OFFLINE
RECO_TEST_CD_00_testceladm01 OFFLINE
RECO_TEST_CD_01_testceladm01 OFFLINE
RECO_TEST_CD_02_testceladm01 OFFLINE
RECO_TEST_CD_03_testceladm01 OFFLINE
RECO_TEST_CD_04_testceladm01 OFFLINE
RECO_TEST_CD_05_testceladm01 OFFLINE

9. Recreate the flash cache

CellCLI> create flashcache all
Flash cache testceladm01_FLASHCACHE successfully created

10. Check the status of the cell to confirm that it’s now in WriteBack mode:

CellCLI> list cell attributes flashCacheMode
writeback

11. Repeat the same steps on the next cell. However, before taking another storage server offline, run the following and make sure ‘asmdeactivationoutcome’ displays Yes for every grid disk. (At this point the RECO_TEST disk group has started synchronization.)

CellCLI> list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
DATA_TEST_CD_00_testceladm01 ONLINE Yes
DATA_TEST_CD_01_testceladm01 ONLINE Yes
DATA_TEST_CD_02_testceladm01 ONLINE Yes
DATA_TEST_CD_03_testceladm01 ONLINE Yes
DATA_TEST_CD_04_testceladm01 ONLINE Yes
DATA_TEST_CD_05_testceladm01 ONLINE Yes
DBFS_DG_CD_02_testceladm01 ONLINE Yes
DBFS_DG_CD_03_testceladm01 ONLINE Yes
DBFS_DG_CD_04_testceladm01 ONLINE Yes
DBFS_DG_CD_05_testceladm01 ONLINE Yes
RECO_TEST_CD_00_testceladm01 SYNCING Yes
RECO_TEST_CD_01_testceladm01 SYNCING Yes
RECO_TEST_CD_02_testceladm01 SYNCING Yes
RECO_TEST_CD_03_testceladm01 SYNCING Yes
RECO_TEST_CD_04_testceladm01 SYNCING Yes
RECO_TEST_CD_05_testceladm01 SYNCING Yes

All disk groups are synchronized now, so we can proceed to the next cell.

CellCLI> list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
DATA_TEST_CD_00_testceladm01 ONLINE Yes
DATA_TEST_CD_01_testceladm01 ONLINE Yes
DATA_TEST_CD_02_testceladm01 ONLINE Yes
DATA_TEST_CD_03_testceladm01 ONLINE Yes
DATA_TEST_CD_04_testceladm01 ONLINE Yes
DATA_TEST_CD_05_testceladm01 ONLINE Yes
DBFS_DG_CD_02_testceladm01 ONLINE Yes
DBFS_DG_CD_03_testceladm01 ONLINE Yes
DBFS_DG_CD_04_testceladm01 ONLINE Yes
DBFS_DG_CD_05_testceladm01 ONLINE Yes
RECO_TEST_CD_00_testceladm01 ONLINE Yes
RECO_TEST_CD_01_testceladm01 ONLINE Yes
RECO_TEST_CD_02_testceladm01 ONLINE Yes
RECO_TEST_CD_03_testceladm01 ONLINE Yes
RECO_TEST_CD_04_testceladm01 ONLINE Yes
RECO_TEST_CD_05_testceladm01 ONLINE Yes
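Instead of rerunning the list command by hand until the resync finishes, the wait can be scripted. This is a minimal sketch that assumes the three-column output shown above (name, asmmodestatus, asmdeactivationoutcome); the function names and the 60-second interval in the usage line are my own choices.

```shell
# griddisks_all_online
# Reads "name asmmodestatus asmdeactivationoutcome" lines on stdin.
# Succeeds only if every grid disk is ONLINE with outcome Yes.
griddisks_all_online() {
    awk '
        NF { n++; if ($2 != "ONLINE" || $3 != "Yes") bad++ }
        END { exit (n > 0 && bad == 0) ? 0 : 1 }
    '
}

# wait_until_online INTERVAL CMD [ARGS...]
# Reruns CMD every INTERVAL seconds until its output passes the check above.
wait_until_online() {
    interval=$1
    shift
    until "$@" | griddisks_all_online; do
        sleep "$interval"
    done
}

# Usage on the cell that was just changed:
#   wait_until_online 60 cellcli -e "list griddisk attributes name,asmmodestatus,asmdeactivationoutcome"
```

The loop has no timeout, so a disk that never comes back ONLINE keeps it running; that is intentional here, since moving to the next cell in that state is exactly what we want to prevent.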

FINALLY
After changing the flashcache modes on all cells, check if flashcache modes are changed to write-back for all cells.

[root@testdbadm01 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"
testceladm01: writeback
testceladm02: writeback
testceladm03: writeback

Using the dcli Utility

The dcli utility executes commands across a group of servers on Oracle Big Data Appliance and returns the output.

1 Overview of the dcli Utility

The dcli utility executes commands on multiple Oracle Big Data Appliance servers in parallel, using the InfiniBand (bondib0) interface to make the connections. You can run the utility from any server.

1.1 Setting Up Passwordless SSH

The dcli utility requires a passwordless Secure Shell (SSH) between the local server and all target servers. You run the dcli utility on the local server, and the commands specified in dcli execute on the target servers.

Two scripts facilitate the use of SSH on Oracle Big Data Appliance: setup-root-ssh and remove-root-ssh. These scripts accept two options that are also used by dcli:

  • -C: Targets all the servers in a Hadoop cluster
  • -g: Targets a user-defined set of servers

To set up passwordless SSH for root:

  1. Connect to an Oracle Big Data Appliance server using PuTTY or a similar utility. Select an SSH connection type.
  2. Log in as root.
  3. Set up passwordless SSH for root across the rack:
    setup-root-ssh
    

    Or, to set up passwordless SSH across a Hadoop cluster of multiple racks:

    setup-root-ssh -C
    

    You see the message “ssh key added” from each server.

  4. You can now run any ssh command on any server in the rack without entering a password. In addition to dcli commands, you can use scp to copy files between servers.
  5. To remove passwordless SSH from root:
    remove-root-ssh
    

1.2 Basic Use of dcli

This topic identifies some of the basic options to the dcli command.

1.2.1 Getting Help

To see the dcli help page, issue the dcli command with the -h or --help option. You can see a description of the commands by issuing the dcli command with no options.

1.2.2 Identifying the Target Servers

You can identify the servers where you want the commands to run either on the command line or in a file. For a list of default target servers, use the -t option. To change the target servers for the current command, use the -c or -g options described in the table below.

The /opt/oracle/bda directory contains two files for executing commands on multiple servers:

  • rack-hosts-infiniband is the default target group of servers for the dcli, setup-root-ssh, and remove-root-ssh utilities. The file initially contains the default factory IP addresses. The network configuration process changes this file to the custom IP addresses identified in the Oracle Big Data Appliance Configuration Worksheets.
  • cluster-hosts-infiniband contains the names of all servers in the Hadoop cluster created by the Mammoth Utility. A cluster can span one or more Oracle Big Data Appliance racks.

You can manually create additional files with groups of servers that you want to manage together. For example, you might manage servers 5 to 18 together, because they have no special functions like servers 1 to 4.
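A group file is a plain list of servers, one per line, so it can be generated rather than typed. This is a minimal sketch for the "servers 5 to 18" example above; the host name pattern bda1nodeNN, the file name ~/workers_group, and the helper make_group are assumptions for illustration, so substitute the names or IP addresses used on your rack.

```shell
# make_group PREFIX FIRST LAST
# Prints one host name per line, PREFIX followed by a two-digit number,
# for every number from FIRST to LAST inclusive.
make_group() {
    prefix=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        printf '%s%02d\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

# Usage (hypothetical file name):
#   make_group bda1node 5 18 > ~/workers_group
#   dcli -g ~/workers_group date
```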

1.2.3 Specifying the Commands

You typically specify a command for execution on the target servers on the command line. However, you can also create a command file for a series of commands that you often issue together or for commands with complex syntax. See the -x option in the table below.

You can also copy files to the target servers without executing them by using the -f option.

1.2.4 Controlling the Output Levels

You can request more information with the -v option or less information with the -n option. You can also limit the number of returned lines with the --maxlines option, or replace matching strings with the -r option.

Following are examples of various output levels using a simple example: the Linux date command.

Note:

The output from only one server (node07) is shown. The syntax in these examples issues the date command on all 18 servers.

This is the default output, which lists the server followed by the output.

# dcli date
bda1node07-adm.example.com: Tue Feb 14 10:22:31 PST 2012

The minimal output returns OK for completed execution:

# dcli -n date
OK: ['bda1node07.example.com']

Verbose output provides extensive information about the settings under which the command ran:

# dcli -v date
options.nodes: None
options.destfile: None
options.file: None
options.group: dcservers
options.maxLines: 100000
options.listNegatives: False
options.pushKey: False
options.regexp: None
options.sshOptions: None
options.scpOptions: None
options.dropKey: False
options.serializeOps: False
options.userID: root
options.verbosity 1
options.vmstatOps None
options.execfile: None
argv: ['/opt/oracle/bda/bin/dcli', '-g', 'dcservers', '-v', 'date']
Success connecting to nodes: ['bda1node07.example.com']
...entering thread for bda1node07.example.com:
execute: /usr/bin/ssh -l root bda1node07.example.com ' date'
...exiting thread for bda1node07.example.com status: 0
bda1node07.example.com: Tue Feb 14 10:24:43 PST 2013

2 dcli Syntax

dcli [options] [command]

Parameters

options
The options are described in the table below. You can omit all options to run a command on all servers.

command
Any command that runs from the operating system prompt. If the command contains punctuation marks or special characters, then enclose the command in double quotation marks.

The backslash (\) is the escape character. Precede the following special characters with a backslash on the command line to prevent interpretation by the shell. The backslash is not needed in a command file. See the -x option for information about command files.
$ (dollar sign)
' (quotation mark)
< (less than)
> (greater than)
( ) (parentheses)
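
The effect of the backslash can be seen without dcli. The following sketch is an illustration only: the helper remote is hypothetical and uses sh -c as a stand-in for the shell that runs the command on a target server, and the variable FOO exists only for this example.

```shell
# remote COMMAND
# Stand-in for remote execution: runs COMMAND in a child shell whose
# environment has FOO=remote, the way a target server has its own variables.
remote() { FOO=remote sh -c "$1"; }

FOO=local                # value known only to the local (calling) shell

# Unescaped: the local shell expands $FOO before the command is sent.
remote "echo $FOO"       # prints "local"

# Escaped: the backslash keeps $FOO literal, so the child shell expands it.
remote "echo \$FOO"      # prints "remote"
```

The same reasoning applies to a command such as dcli "echo \$HOSTNAME", where the intent is for each target server to report its own value.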

Table: dcli Options

Option                Description
-c nodes              Specifies a comma-separated list of Oracle Big Data Appliance servers where the command is executed.
-C                    Uses the list of servers in /opt/oracle/bda/cluster-rack-infiniband as the target.
-d destfile           Specifies a target directory or file name for the -f option.
-f file               Specifies files to be copied to the user's home directory on the target servers. The files are not executed. See the -l option.
-g groupfile          Specifies a file containing a list of Oracle Big Data Appliance servers where the command is executed. Either server names or IP addresses can be used in the file.
-h, --help            Displays a description of the commands.
-k                    Pushes the ssh key to each server's /root/.ssh/authorized_keys file.
-l userid             Identifies the user ID for logging in to another server. The default ID is root.
--maxlines=maxlines   Identifies the maximum lines of output displayed from a command executed on multiple servers. The default is 10,000 lines.
-n                    Abbreviates the output for non-error messages. Only the server name is displayed when a server returns normal output (return code 0). You cannot use the -n and -r options together.
-r regexp             Replaces the output with the server name for lines that match the specified regular expression.
-s sshoptions         Specifies a string of options that are passed to SSH.
--scp=scpoptions      Specifies a string of options that are passed to Secure Copy (SCP), when these options are different from sshoptions.
--serial              Serializes execution over the servers. The default is parallel execution.
-t                    Lists the target servers.
--unkey               Drops the keys from the authorized_keys files of the target servers.
-v                    Displays the verbose version of all messages.
--version             Displays the dcli version number.
--vmstat=VMSTATOPS    Displays the syntax of the Linux Virtual Memory Statistics utility (vmstat). This command returns process, virtual memory, disk, trap, and CPU activity information. To issue a vmstat command, enclose its options in quotation marks. For example: --vmstat="-a 3 5". See your Linux documentation for more information about vmstat.
-x execfile           Specifies a command file to be copied to the user's home directory and executed on the target servers. See the -l option.

3 dcli Return Values

  • 0: The command ran successfully on all servers.
  • 1: One or more servers were inaccessible or remote execution returned a nonzero value. A message lists the unresponsive servers. Execution continues on the other servers.
  • 2: A local error prevented the command from executing.

If you interrupt the local dcli process, then the remote commands may continue without returning their output or status.
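
In a script, the return value is available in $? immediately after dcli exits. The following is a minimal sketch that maps the three documented values to the descriptions above; the function name explain_dcli_status is hypothetical, and the fallback branch is an assumption for values this section does not describe.

```shell
# explain_dcli_status CODE
# Prints the meaning of a dcli return value as listed in this section.
explain_dcli_status() {
    case "$1" in
        0) echo "command ran successfully on all servers" ;;
        1) echo "one or more servers were inaccessible or returned a nonzero value" ;;
        2) echo "a local error prevented the command from executing" ;;
        *) echo "unexpected status: $1" ;;   # not described in this section
    esac
}

# Usage:
#   dcli date
#   explain_dcli_status $?
```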

4 dcli Examples

Following are examples of the dcli utility.

This example returns the default list of target servers:

# dcli -t
Target nodes: ['bda1node01-adm.example.com', 'bda1node02-adm.example.com', 
'bda1node03-adm.example.com', 'bda1node04-adm.example.com', 
'bda1node05-adm.example.com', 'bda1node06-adm.example.com', 
'bda1node07-adm.example.com', 'bda1node08-adm.example.com', 
'bda1node09-adm.example.com']

The next example checks the temperature of all servers:

# dcli 'ipmitool sunoem cli "show /SYS/T_AMB" | grep value'

bda1node01-adm.example.com: value = 22.000 degree C
bda1node02-adm.example.com: value = 22.000 degree C
bda1node03-adm.example.com: value = 22.000 degree C
bda1node04-adm.example.com: value = 23.000 degree C