A Beginner’s Guide to NoSQL

October 13, 2014

What is it going to be? MySQL, MS-SQL, Oracle or PostgreSQL? After all, nothing can be as amazing as a good old RDBMS that employs SQL to manage the data.

Well, allow me to introduce to you an entirely unique and unconventional database model – NoSQL. Just like every other fine article out there, we too shall begin with…eh…disclaimers!

NoSQL stands for not-only-SQL. The idea here is not to oppose SQL, but instead to provide an alternative in terms of storage of data. Yet, for the obvious reason that most users are well versed with SQL, many NoSQL databases strive to provide an SQL-like query interface.

Why NoSQL?  

That’s a valid question, indeed. Well, here are the reasons:

  • Managing Large Chunks of Data: NoSQL databases can easily handle heavy read/write cycles, large numbers of users and amounts of data ranging into petabytes.
  • Schema? Nah, not needed: Most NoSQL databases are devoid of schema and therefore very flexible. They provide great choices when it comes to constructing a schema and foster easy mapping of objects into them. Terms such as normalization and complex joins are, well, not needed!
  • Programmer-friendly: NoSQL databases provide simple APIs in every major programming language, so there is no need for complex ORM frameworks. And just in case APIs are not available for a particular programming language, data can still be accessed over HTTP via a simple RESTful API, using XML and/or JSON.
  • Availability: Most distributed NoSQL databases provide easy replication of data, and the failure of one node does not affect the availability of data in a major way.
  • Scalability: NoSQL databases do not require a dedicated high-performance server. Actually, they can easily be run on a cluster of commodity hardware, and scaling out is as simple as adding a new node.
  • Low Latency: Unless you are running a cluster of a trillion data servers (or something like that, give or take a few million of them), NoSQL can help you achieve extremely low latency. Of course, latency itself depends on the amount of data that can be successfully loaded into memory.


SQL ideology 

Basically, NoSQL drops the traditional SQL ideology in favor of the CAP Theorem (or Brewer’s Theorem), formulated by Eric Brewer in 2000. The theorem covers three basic properties, Consistency, Availability and Partition tolerance (abbreviated as CAP), and states that a distributed database can satisfy at most two of them. NoSQL databases implement the theorem by employing Eventual Consistency, a more relaxed form of consistency that allows the data to converge over a sufficient period of time. This in turn improves availability and scalability to a great extent. This paradigm is often termed BASE: Basically Available, Soft state, Eventual consistency.
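
To make eventual consistency concrete, here is a minimal sketch (in Python, purely illustrative; the `Replica` class and `anti_entropy` sync are invented for this example) of two replicas that accept writes without coordination and later converge via a last-write-wins rule:

```python
class Replica:
    """One node in a toy eventually consistent store (last-write-wins)."""
    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, ts):
        # Accept the write locally, with no coordination with other replicas.
        current = self.data.get(key)
        if current is None or ts >= current[0]:
            self.data[key] = (ts, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def anti_entropy(a, b):
    """Background sync: exchange entries; the newest timestamp wins on both sides."""
    for key in set(a.data) | set(b.data):
        for src, dst in ((a, b), (b, a)):
            if key in src.data:
                ts, value = src.data[key]
                dst.write(key, value, ts)

r1, r2 = Replica(), Replica()
r1.write("user:1", "Jonny", ts=1)   # the write lands on r1 only
print(r2.read("user:1"))            # None: r2 has not converged yet
anti_entropy(r1, r2)                # replicas converge "eventually"
print(r2.read("user:1"))            # Jonny
```

Between the write and the sync, the two replicas disagree; that window of disagreement is exactly the "soft state" that BASE trades away in exchange for availability.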

NoSQL Data Models

The major and most prominent data models among NoSQL databases are as follows:

  1. Document Stores
  2. Hierarchical
  3. Network
  4. Column-oriented
  5. Object-oriented
  6. Key-value Stores
  7. Triple Stores

Document stores

Gone are the days when data organization used to be as minimal as simple rows and columns. Today, data is more often than not represented in the form of XML or JSON (we’re talking about the Web, basically). The reason for favoring XML or JSON is that both are extremely portable, compact and standardized. Bluntly put, it makes little sense to map XML or JSON documents into a relational model. Instead, a wiser decision would be to utilize the document stores already available. Why? Again, simply because NoSQL databases are schema-less: there is no predefined schema for an XML or JSON document, and as a result each document is independent of the others. The database can be employed in CRM, web-related data, real-time data, etc. Some of the most well known implementations are MongoDB, CouchDB and RavenDB. In fact, MongoDB has been used by websites such as bit.ly and Sourceforge.
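
As an illustration of the schema-less idea, here is a toy in-memory document store (a Python sketch invented for this article, not any real product’s API) that accepts documents of arbitrary shape and supports query-by-example:

```python
import itertools

class DocumentStore:
    """Toy in-memory document store: every document is just a dict, no schema."""
    def __init__(self):
        self._docs = {}
        self._ids = itertools.count(1)

    def insert(self, doc):
        doc_id = next(self._ids)
        self._docs[doc_id] = doc
        return doc_id

    def find(self, **criteria):
        # Query by example, in the spirit of MongoDB's find({...}).
        return [d for d in self._docs.values()
                if all(d.get(k) == v for k, v in criteria.items())]

store = DocumentStore()
# Two documents with completely different shapes; no predefined schema needed.
store.insert({"type": "article", "title": "A Beginner's Guide to NoSQL"})
store.insert({"type": "user", "name": "Jonny Nitro", "reads": "Data Center Magazine"})
print(store.find(type="user")[0]["name"])   # Jonny Nitro
```

Note that neither insert had to declare columns or run a migration; each document simply carries whatever fields it needs.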

Hierarchical Databases 

These databases store data in hierarchical form, that is, as tree or parent-child relationships. In terms of relational models, this can be termed a 1:N relationship. Basically, geospatial databases can be used in a hierarchical form to store location information, which is essentially hierarchical, though algorithms may vary. Geotagging and geolocation are in vogue of late. It is in such uses that a geospatial database becomes very relevant, and can be used in a Geographical Information System. Major examples of the same include PostGIS, Oracle Spatial, etc. Also, some of the most well known implementations of hierarchical databases are the Windows Registry by Microsoft and the IMS Database by IBM.

Graph Network Databases

Graph databases are the most popular form of network database and are used to store data that can be represented in the form of a graph. Basically, data stored by graph databases can grow exponentially, and thus graph databases are ideal for storing data that changes frequently. Theory aside, perhaps the best-known example of a graph database is FlockDB, developed by Twitter to implement the graph of who follows whom. FlockDB uses the Gizzard framework to query a database up to 10,000 times per second. A general technique to query a graph is to begin from an arbitrary or specified start node and traverse the graph in a depth-first or breadth-first fashion, following the relationships that match the given criteria. Most graph databases allow the developer to use simple APIs for accomplishing the task. For instance, you can make queries such as: “Does Jonny Nitro read Data Center Magazine?” Some of the most popular graph databases include, apart from FlockDB, HyperGraphDB and Neo4j.
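
The traversal technique described above can be sketched as follows (illustrative Python, not any real graph database’s API): a breadth-first search from a start node that follows only edges matching the requested relationship:

```python
from collections import deque

# Toy adjacency-list graph: node -> list of (relationship, target) edges.
graph = {
    "Jonny Nitro": [("reads", "Data Center Magazine"), ("follows", "Alice")],
    "Alice": [("follows", "Bob")],
}

def connected(start, rel, target):
    """Breadth-first search from `start`, following only `rel` edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        for edge_rel, neighbour in graph.get(node, []):
            if edge_rel == rel:
                if neighbour == target:
                    return True
                queue.append(neighbour)
    return False

print(connected("Jonny Nitro", "reads", "Data Center Magazine"))  # True
print(connected("Jonny Nitro", "follows", "Bob"))                 # True (transitive)
```

The same traversal answers both the direct question ("does X read Y?") and transitive ones ("does X follow Y through a chain of follows?"), which is exactly where graph stores beat multi-way relational joins.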

Column-oriented Databases

Column-oriented databases came into existence after Google’s research paper on its BigTable distributed storage system, which is used internally along with the Google File System. Some of the popular implementations are HBase (on Hadoop), Apache Cassandra, Hypertable, etc.

Such databases are implemented more like three-dimensional arrays, the first dimension being the row identifier, the second being a combination of column family plus column identifier and the third being the timestamp. Column-oriented databases are employed by Facebook, Reddit, Digg, etc.
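
A sketch of that three-dimensional addressing scheme (hypothetical Python, loosely modeled on BigTable’s row / column family:qualifier / timestamp cells; the `ColumnStore` class is invented here):

```python
class ColumnStore:
    """Toy BigTable-style cells: (row, family:qualifier, timestamp) -> value."""
    def __init__(self):
        self._cells = {}

    def put(self, row, column, value, ts):
        # The three "dimensions": row key, then column, then timestamp.
        self._cells.setdefault(row, {}).setdefault(column, {})[ts] = value

    def get(self, row, column):
        """Return the most recent version of the cell, or None."""
        versions = self._cells.get(row, {}).get(column, {})
        if not versions:
            return None
        return versions[max(versions)]

store = ColumnStore()
store.put("user1", "info:name", "Jonny", ts=1)
store.put("user1", "info:name", "Jonny Nitro", ts=2)  # newer version
print(store.get("user1", "info:name"))  # Jonny Nitro
```

Keeping every timestamped version is what lets such systems serve point-in-time reads and age out old cell versions.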

Object-oriented Databases

Whether or not object-oriented databases are purely NoSQL databases is debatable, yet they are more often than not considered to be so, because such databases too depart from traditional RDBMS-based data models. Such databases allow the storage of data in the form of objects, thereby making it highly transparent. Some of the most popular ones include db4o, NEO, Versant, etc. Object-oriented databases are generally used for research or in web-scale production.

Key-value stores

Key-value stores are (arguably) based on Amazon’s Dynamo research paper and distributed hash tables. Such data models are extremely simplified and generally contain only one set of global key-value pairs, with each value having a unique key associated with it. The database, therefore, is highly scalable and does not store data relationally. Some popular implementations include Project Voldemort (open-sourced by LinkedIn), Redis, Tokyo Cabinet, etc.
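
One way such stores spread a single global key space across nodes is consistent hashing, as popularized by the Dynamo paper. A toy sketch (illustrative Python; the `HashRing` class is invented for this example):

```python
import hashlib
from bisect import bisect

class HashRing:
    """Toy consistent-hash ring for placing keys on nodes, Dynamo-style."""
    def __init__(self, nodes):
        # Each node is placed on the ring at the position of its hash.
        self._ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Pick the first node clockwise from the key's hash, wrapping around.
        hashes = [h for h, _ in self._ring]
        idx = bisect(hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
# The same key always lands on the same node; adding a node only remaps the
# keys that fall between the new node and its predecessor on the ring.
print(ring.node_for("session:42"))
```

This is why "scaling out is as simple as adding a new node": joining a node steals only its slice of the ring instead of forcing a full rehash of every key.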

Triple stores

Triple stores save data in the form of subject-predicate-object, with the predicate being the linking factor between subject and object. As such, triple stores too are variants of network databases. For instance, let’s say “Jonny Nitro reads Data Center Magazine.” In this case, Jonny Nitro is the subject, Data Center Magazine is the object, and the term ‘reads’ acts as the predicate linking the subject with the object. Quite obviously, mapping such semantic queries into SQL will prove difficult, and therefore NoSQL offers a viable alternative. Some of the major implementations of triple stores are Sesame, Jena, Virtuoso, AllegroGraph, etc.
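
A triple store can be sketched as a list of (subject, predicate, object) facts with wildcard pattern matching (illustrative Python, not the API of Sesame, Jena or any other real triple store):

```python
# Toy triple store: facts as (subject, predicate, object) tuples, queried by
# pattern matching where None acts as a wildcard.
triples = [
    ("Jonny Nitro", "reads", "Data Center Magazine"),
    ("Jonny Nitro", "writes", "Blog Posts"),
    ("Alice", "reads", "Data Center Magazine"),
]

def match(subject=None, predicate=None, obj=None):
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

# "Does Jonny Nitro read Data Center Magazine?"
print(bool(match("Jonny Nitro", "reads", "Data Center Magazine")))  # True
# "Who reads Data Center Magazine?"
print([s for s, _, _ in match(predicate="reads", obj="Data Center Magazine")])
```

Leaving any position as a wildcard turns the same store into an answer engine for "who?", "what?" or "how related?" questions, which is the semantic-query flexibility the paragraph above describes.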

Summary

So, what now? Well, you’ve just been introduced to NoSQL. However, does this mean that you should make the switch to it from SQL? Perhaps. Or perhaps not. The answer varies from situation to situation. If you find SQL queries way too much to cope with, chances are you’ll find NoSQL equally difficult. However, if you’re looking for a more flexible alternative and do not mind getting your hands dirty, you should definitely give NoSQL a spin! The choice, obviously, is yours! Happy data managing to you!

What NoSQL databases are present today?

  • Cassandra
    • Data Model: Columnfamily
    • Query API: Thrift
  • CouchDB
    • Data Model: Document
    • Query API: map/reduce views
  • HBase
    • Data Model: Columnfamily
    • Query API: Thrift, REST
  • MongoDB
    • Data Model: Document
    • Query API: Cursor
  • Neo4j
    • Data Model: Graph
    • Query API: Graph
  • Redis
    • Data Model: Collection
    • Query API: Collection
  • Riak
    • Data Model: Document
    • Query API: Nested hashes
  • Scalaris
    • Data Model: Key/value
    • Query API: get/put
  • Tokyo Cabinet
    • Data Model: Key/value
    • Query API: get/put
  • Voldemort
    • Data Model: Key/value
    • Query API: get/put
Categories: big data, nosql, sql

Siebel Web Service Call with SOAP

October 13, 2014

If you’re currently working with Siebel you might be familiar with their SOAP interface. When the software was originally developed, this was in accordance with best practices. However, today REST has become the dominant standard for connectivity on the web, so if you’re consuming Siebel with Ruby, PHP, Python, or .NET, you’re probably looking for a REST API. Siebel does support REST, but you must buy Fusion Middleware, a costly add-on to your existing platform.

This tutorial will give you basic steps for consuming the SOAP API directly from Node.js. You’ll see that the SOAP connection is fairly straightforward with the help of some libraries. Our demonstration will use the order creation service – which is what you’d use if you were building an ecommerce site that needed to submit orders to Siebel.

By consuming SOAP directly, we can avoid the extra costs of the REST translation, with very little down side.

Siebel API Overview

Siebel’s API is organized by:

  • Service – This is a high-level grouping that covers a whole area of the API. For instance, we’ll be working with the Order Management service.
  • Port – This is a specific resource managed by the service. For instance, within the Order Management service you will find two ports – Order and OrderItem.
  • Method – This is the actual process that you will interact with. Among others, each port will have the familiar CRUD operations.

Terminology: when I say “service” I often mean a single function, which in Siebel/SOAP terminology is a “method”. Sorry in advance for any confusion.

You can find a reference for the Siebel Order Management API here. Scroll to the bottom section for a list of the methods.

What You’ll Need

Siebel instance v7.8 or later

I’m going to assume that if you’re reading this article you’re probably already using Siebel and have an instance available.

Node.js v0.10.0 or later

You’ll need this to run our sample code. Node includes a package manager (npm) which will bring in all the other dependencies we need.

A terminal

If you’re in Windows I’d recommend git-bash.

A text editor

Whatever you’re comfortable with here. My code is written in CoffeeScript with 2 spaces for indentation, so if you’re going to edit that, make sure you’re not mixing tabs in or you’ll get strange syntax errors.

Overview of the Node-Siebel Project

The primary responsibilities we’ve taken on with node-siebel are:

  1. Generate services that can be called programmatically. We make some assumptions about the data formats in an attempt to make your life easier, so this probably won’t work for a non-Siebel API.
  2. Create a REST router for the services and hand it back to you. This is something you can plug in as express/connect/http middleware in node, and start serving your REST endpoints in just a few lines of code.

Let’s walk through the included tests, which should give you a sense of how to utilize this functionality in your own app. Go ahead and open the tests, which you’ll find here.

The first test, “should generate services” shows the services that will be generated from the sample WSDL, which is for the Order service. Each “service” in our world corresponds to a “method” in the SOAP world.

The next test, “should create an order”, demonstrates calling a service. If you look at the sample data being passed, you’ll notice that it differs from the message schema in the WSDL. Namely, the header elements are included for you, and you are just responsible for specifying the fields in a single document, represented as an object. This is an attempt to be more consistent with what I would expect from a REST interface. You’ll still need to provide the list/header elements for any sub-documents (e.g. OrderItems). Refer to the WSDL for the exact schema, and print the last request if you’re unsure what’s being sent.

NOTE: The Id and OrderNumber fields have a unique constraint, so you’ll need to change the values in order to run this multiple times.

The next test “should list orders” doesn’t really add much that’s new, but it’s not stateful, so you can safely run it many times.

The last test, “should generate REST API” is an integration test. It should be pretty close to the code you’d need to integrate with your own application.

You’ll notice that in addition to returning a set of services, the serviceGenerator also gives you a router. This can be plugged into connect or express, just as any other middleware would be. It has REST routes preconfigured that will forward your requests to the node-soap client, which communicates with the SOAP interface.
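
The routing convention can be sketched like this (Python is used here for brevity rather than the project’s JavaScript; `soap_route` is a hypothetical restatement of the endpoint conventions shown later in this post, not node-siebel’s actual code):

```python
def soap_route(verb, resource, has_id):
    """Translate an HTTP request into a Siebel-style Port/Method pair,
    following the conventions of the generated endpoint table."""
    port = resource.capitalize()  # "order" -> "Order"
    if has_id:
        # Routes addressing a single record, e.g. /order/:Id
        mapping = {"POST": f"Synchronize{port}", "DELETE": f"Delete{port}",
                   "GET": f"Get{port}ById", "PUT": f"Update{port}"}
    else:
        # Collection routes, e.g. /order
        mapping = {"GET": f"Get{port}", "POST": f"Insert{port}"}
    return f"{port}/{mapping[verb]}"

print(soap_route("GET", "order", has_id=True))    # Order/GetOrderById
print(soap_route("POST", "order", has_id=False))  # Order/InsertOrder
```

The point is that the router needs no per-service configuration: the HTTP verb plus the presence of an `:Id` segment is enough to pick the SOAP method name.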

The code used to configure the connect middleware and start a server is standard boilerplate, and well documented in the Connect project.

Using Your Own Siebel Services

Assuming your API follows the same conventions as the Siebel standard methods, things will be easy. (Nothing is easy though, so brace yourself.)

Install Node-Siebel

npm install --save node-siebel

Obtain a WSDL

To communicate with other standard services, or a custom one that you’ve created, you’ll need a WSDL file. This is an XML file that describes where the service is and what its capabilities and requirements are.

Let’s go grab one from Administration – Web Services.

  1. Make sure the Siebel instance is running.
  2. Navigate to the Siebel Client in Internet Explorer.
  3. Log in as sadmin.
  4. Go to the Administration – Web Services tab.
  5. Find the service you want to connect to and select it.
  6. Click the Generate WSDL button and complete the wizard.
  7. Transfer the WSDL file to the machine and project directory where you’ll be coding.

For reference, the Administration – Web Services screen looks like this:

[Screenshot: Administration – Web Services interface]

Towards the end of your WSDL file you will find one or more <soap:address> fields. You’ll need to modify the hostnames to match the actual location of your dev server. I found in my tests that I needed to change SecureWebService to WebService, and add &WSSOAP=1 at the end. Here’s some info about the setting.
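
That soap:address rewrite is easy to script; the sketch below (illustrative Python, with a made-up endpoint URL) applies both changes and is safe to run twice:

```python
def fix_soap_address(url):
    """Apply the two tweaks described above: swap SecureWebService for
    WebService and append &WSSOAP=1 if it is missing."""
    url = url.replace("SecureWebService", "WebService")
    if "WSSOAP=1" not in url:
        url += "&WSSOAP=1"
    return url

# Hypothetical endpoint taken from a generated WSDL, for illustration only.
endpoint = "http://devhost/eai_enu/start.swe?SWEExtSource=SecureWebService&SWEExtCmd=Execute"
print(fix_soap_address(endpoint))
```

Running it over each `<soap:address location="...">` value (after also fixing the hostname to your dev server) gives you a WSDL that node-soap can call directly.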

Connecting

Look at our test helper for an example of how to initialize the Node-Soap library… or follow the node-soap docs.

Option A: Programmatic Access

You can probably use the node-soap client directly if you just need programmatic access. But if you prefer not to have to include the headers in the document, go ahead and follow the “should create an order” test example.

Option B: REST Access

Follow the example in the main README. This should have everything you need.

Install

npm install node-siebel

Run Tests

npm install -g mocha
mocha

Tutorial

For more details and background, please see the tutorial.

Create A REST Server

With some minor changes, this should allow you to start a REST server that forwards requests to a node-soap client.

This is from example.js.

// get HTTP modules
var connect = require('connect'),
    http = require('http'),
    request = require('request'),
    server = connect();

// create a node-soap client
var soap = require('soap'),
    join = require('path').join,
    wsdlPath = join(__dirname, "../data/OrderWebService.WSDL"),
    username = 'SADMIN',
    password = 'SADMIN',
    sessionType = 'None';

soap.createClient(wsdlPath, function(err, client) {
    if (err) {
      throw err;
    }

    client.addSoapHeader("<UsernameToken xmlns='http://siebel.com/webservices'>" + username + "</UsernameToken>");
    client.addSoapHeader("<PasswordText xmlns='http://siebel.com/webservices'>" + password + "</PasswordText>");
    client.addSoapHeader("<SessionType xmlns='http://siebel.com/webservices'>" + sessionType + "</SessionType>");

    // create a REST router to forward to the node-soap client
    var serviceGenerator = require('node-siebel'),
        router = serviceGenerator(client).router;

    // connect the router and start the server
    server.use(connect.bodyParser());
    server.use(router);
    http.createServer(server).listen(4000);
});

REST endpoints will be generated based on the WSDL you provide. An example Siebel Order Management WSDL is provided which will generate the following endpoints:

POST   /order/:Id  =>  Order/SynchronizeOrder
DELETE /order/:Id  =>  Order/DeleteOrder
GET    /order/:Id  =>  Order/GetOrderById
PUT    /order/:Id  =>  Order/UpdateOrder
GET    /order      =>  Order/GetOrder
POST   /order      =>  Order/InsertOrder

anything else

POST /order/<action>  =>  Order/<action>

Categories: EAI, Oracle, Siebel

Using Puppet to Manage Oracle

October 13, 2014

This module contains custom types that can help you manage DBA objects in an Oracle database. It runs after the database is installed. IT DOESN’T INSTALL the Oracle database software. With this module, you can set up a database to receive an application. You can:

  • create a tablespace
  • create a user with the required grants and quotas
  • create one or more roles
  • create one or more services

Setup

What oracle affects

The types in this module will change settings inside an Oracle database. No changes are made outside of the database. The code is available at https://github.com/hajee/oracle.

Setup Requirements

To use this module, you need a running Oracle database. I can recommend Edwin Biemond’s Puppet OraDb module. The Oracle module itself is based on easy_type.

Beginning with oracle module

After you have a running database (see Edwin Biemond’s Puppet OraDb module), you need to install easy_type and this module.

puppet module install hajee/easy_type
puppet module install hajee/oracle

Usage

The module contains the following types:

tablespace, oracle_user, role, listener, oracle_service, init_param, asm_diskgroup, oracle_exec and oracle_thread. Here are a couple of examples of how to use them.

listener

This is the only type that does its work outside of the Oracle database. It makes sure the Oracle SQL*Net listener is running.

listener {'SID':
  ensure  => running,
  require => Exec['db_install_instance'],
}

The name of the resource MUST be the sid for which you want to start the listener.

Specifying the SID

All types have a name like sid/resource. The sid is optional. If you don’t specify the sid, the type will use the first database instance from the /etc/oratab file. We advise you to use a full name, e.g. a sid and a resource name. This makes the manifest much more resilient to changes in the environment.
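
A hypothetical sketch of how such a title could be resolved (illustrative Python; the module itself is Ruby/Puppet, and `parse_title` is invented for this example):

```python
def parse_title(title, oratab_lines):
    """Split a 'sid/resource' title; fall back to the first real /etc/oratab
    entry (sid:oracle_home:startup_flag) when no sid is given."""
    if "/" in title:
        sid, resource = title.split("/", 1)
        return sid, resource
    for line in oratab_lines:
        line = line.strip()
        if line and not line.startswith("#"):
            return line.split(":")[0], title
    raise ValueError("no SID given and no oratab entry found")

oratab = ["# /etc/oratab", "ORCL:/u01/app/oracle/product/11.2.0:Y"]
print(parse_title("ORCL/my_app_ts", oratab))  # ('ORCL', 'my_app_ts')
print(parse_title("my_app_ts", oratab))       # ('ORCL', 'my_app_ts')
```

The fallback is convenient on single-instance hosts, but as the advice above says, spelling out the sid keeps the manifest correct when a second instance appears in oratab.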

oracle_user

This type allows you to manage a user inside an Oracle Database. It recognises most of the options that CREATE USER supports. Besides these options, you can also use this type to manage the grants and quotas for this user.

oracle_user{'sid/user_name':
  temporary_tablespace      => 'temp',
  default_tablespace        => 'my_app_ts',
  password                  => 'verysecret',
  require                   => Tablespace['my_app_ts'],
  grants                    => ['SELECT ANY TABLE', 'CONNECT', 'CREATE TABLE', 'CREATE TRIGGER'],
  quotas                    => {
                                  "my_app_ts"  => 'unlimited'
                                },
}

tablespace

This type allows you to manage a tablespace inside an Oracle Database. It recognises most of the options that CREATE TABLESPACE supports.

tablespace {'sid/my_app_ts':
  ensure                    => present,
  datafile                  => 'my_app_ts.dbf',
  size                      => 5G,
  logging                   => yes,
  autoextend                => on,
  next                      => 100M,
  max_size                  => 20G,
  extent_management         => local,
  segment_space_management  => auto,
}

You can also create an undo tablespace:

tablespace {'sid/my_undots_1':
  ensure                    => present,
  content                   => 'undo',
}

or a temporary tablespace:

tablespace {'sid/my_temp_ts':
  ensure                    => present,
  datafile                  => 'my_temp_ts.dbf',
  content                   => 'temporary',
  size                      => 5G,
  logging                   => yes,
  autoextend                => on,
  next                      => 100M,
  max_size                  => 20G,
  extent_management         => local,
  segment_space_management  => auto,
}

role

This type allows you to create or delete a role inside an Oracle Database. It recognises a limited subset of the options that CREATE ROLE supports.

role {'sid/just_a_role':
  ensure    => present,
}

oracle_service

This type allows you to create or delete a service inside an Oracle Database.

oracle_service{'sid/my_app_service':
  ensure  => present,
}

init_param

This type allows you to manage your init.ora parameters.

init_param{'sid/parameter/instance':
  ensure  => present,
  value   => 'the_value'
}

asm_diskgroup

This type allows you to manage your ASM diskgroups. Like the other Oracle types, you must specify the SID, but for this type it must be the ASM sid. Most of the time, this is +ASM1.

asm_diskgroup {'+ASM1/REDO':
  ensure          => 'present',
  redundancy_type => 'normal',
  compat_asm      => '11.2.0.0.0',
  compat_rdbms    => '11.2.0.0.0',
  failgroups      => {
    'CONTROLLER1' => { 'diskname' => 'REDOVOL1', 'path' => 'ORCL:REDOVOL1'},
    'CONTROLLER2' => { 'diskname' => 'REDOVOL2', 'path' => 'ORCL:REDOVOL2'},
  }
}

At this point in time the type supports only the creation and removal of a diskgroup. Modification of diskgroups is not (yet) supported.

oracle_exec

This type allows you to run a specific SQL statement or a SQL file on a specified instance.

oracle_exec{"instance/drop table application_users":
  username => 'app_user',
  password => 'password',
}

This statement will execute the SQL statement drop table application_users on the instance named instance. There is no way for the type to check whether it has already executed this statement, so the developer must handle this, for example with Puppet if statements.

oracle_exec{"instance/@/tmp/do_some_stuff.sql":
  username  => 'app_user',
  password  => 'password',
  logoutput => on_failure,  # can be true, false or on_failure
}

This statement will run the SQL script /tmp/do_some_stuff.sql on the instance named instance. Like the single-statement variant, there is no way to check whether the script has already been run, so the developer must check for this himself.

When you don’t specify the username and the password, the type will connect as sysdba.

oracle_thread

This type allows you to enable a thread. Threads are used in Oracle RAC installations. This type might not be very useful for regular use, but it is used in the Oracle RAC module.

oracle_thread{"instance_name/2":
  ensure  => 'enabled',
}

This enables thread 2 on the instance named instance_name.

Limitations

This module is tested on Oracle 11 on CentOS and Red Hat. It will probably work on other Linux distributions. It will definitely not work on Windows. As far as Oracle compatibility goes, most of the SQL commands it creates under the hood are pretty much independent of the Oracle version, so it should work on most Oracle versions.

Development

This is an open project, and contributions are welcome.

OS support

Currently we have tested:

  • Oracle 11.2.0.2 & 11.2.0.4
  • CentOS 5.8
  • Redhat 5

It would be great if we could get it working and tested on:

  • Oracle 12
  • Debian
  • Windows
  • Ubuntu
  • ….

Oracle version support

Currently we have tested:

  • Oracle 11.2.0.2
  • Oracle 11.2.0.4

It would be great if we could get it working and tested on:

  • Oracle 12

Managable Oracle objects

Obviously, Oracle has many, many more DBA objects that need management, and for some of these Puppet would be a big help. It would be great if we could get help extending this module to support all of those objects.

If you have knowledge in these technologies, know how to code, and wish to contribute to this project, we would welcome the help.

Testing

Make sure you have:

  • rake
  • bundler

Install the necessary gems:

bundle install

And run the tests from the root of the source code:

rake test
Categories: Database, Oracle

Restricted access to PL/SQL subprograms in Oracle 12c

October 13, 2014

Prior to Oracle 12c, anyone could refer to a subprogram (a helper program) from another PL/SQL program unit, provided that user owned the helper object or had the EXECUTE privilege on it. Now, in Oracle 12c, the creator of the helper can determine which program units may refer to it, even if other users hold the EXECUTE privilege on the helper object or the EXECUTE ANY PROCEDURE system privilege. Even when the owner of the helper object is the same as the owner of the calling PL/SQL subprogram, a dependent object that is not entitled to use the helper subprogram cannot refer to it. The new feature is that the helper PL/SQL subprogram can have an ACCESSIBLE BY (subprogram1, subprogram2, …) clause, with which the creator grants access to the subprograms listed after the ACCESSIBLE BY keywords.
In the following example, the HR user, who created the tax function, grants access to the tax function to the depts procedure (owned by HR) and to depts2, owned by the CZINK user.
Note that HR also issues the suitable object privilege to CZINK.
Let’s see the definition of the tax function and the GRANT statement:

CREATE OR REPLACE FUNCTION tax(BASE NUMBER)
RETURN NUMBER
ACCESSIBLE BY (depts,czink.depts2)
IS
S NUMBER;
BEGIN
IF BASE<4000 THEN
  S:= 0.10;
ELSIF BASE<20000 THEN
  S:=0.25;
ELSE
  S:=0.3;
END IF;
RETURN BASE*S;
END;
/
GRANT EXECUTE ON tax TO czink;

Now the HR user created a procedure called depts and executed it:

CREATE OR REPLACE
PROCEDURE depts(p_deptno  employees.department_id%TYPE)
AUTHID CURRENT_USER
IS
CURSOR c_emp(c_deptno employees.department_id%TYPE) IS
SELECT e.*, 12*salary*(1+NVL(commission_pct,0)) ANN_SAL
FROM employees e WHERE department_id=c_deptno;
manager employees.last_name%TYPE;
BEGIN
FOR r IN c_emp(p_deptno) LOOP
    IF r.manager_id IS NOT NULL THEN
     SELECT last_name INTO MANAGER FROM employees
     WHERE employee_id=r.manager_id;
   ELSE
    manager:='No Manager';
   END IF;
   DBMS_OUTPUT.PUT_LINE(c_emp%ROWCOUNT||'. name:='||r.last_name||' salary:'||r.salary||
   ' tax:'|| tax(r.salary) ||  ' Manager:'||manager||' '||r.manager_id||' Annual Salary:'||r.ann_sal);
END LOOP;
EXCEPTION
 WHEN OTHERS THEN
dbms_output.put_line('The error:'||DBMS_UTILITY.FORMAT_ERROR_STACK );
END;
/
exec depts(90)
anonymous block completed
1. name:=King salary:24000 tax:7200 Manager:No Manager  Annual Salary:288000
2. name:=Kochhar salary:17000 tax:4250 Manager:King 100 Annual Salary:204000
3. name:=De Haan salary:17000 tax:4250 Manager:King 100 Annual Salary:204000

However, if HR wants to create a depts2 procedure with the following code, Oracle produces an error message, because the depts2 procedure WAS NOT ENTITLED to refer to the tax function:

CREATE OR REPLACE
PROCEDURE depts2(p_deptno  employees.department_id%TYPE)
AUTHID CURRENT_USER
IS
CURSOR c_emp(c_deptno employees.department_id%TYPE) IS
SELECT e.*, 12*salary*(1+NVL(commission_pct,0)) ANN_SAL
FROM employees e WHERE department_id=c_deptno;
manager employees.last_name%TYPE;
BEGIN
FOR r IN c_emp(p_deptno) LOOP
    IF r.manager_id IS NOT NULL THEN
     SELECT last_name INTO MANAGER FROM employees
     WHERE employee_id=r.manager_id;
   ELSE
    manager:='No Manager';
   END IF;
   DBMS_OUTPUT.PUT_LINE(c_emp%ROWCOUNT||'. name:='||r.last_name||' salary:'||r.salary||
   ' tax:'|| tax(r.salary) ||  ' Manager:'||manager||' '||r.manager_id||' Annual Salary:'||r.ann_sal);
END LOOP;
EXCEPTION
 WHEN OTHERS THEN
dbms_output.put_line('The error:'||DBMS_UTILITY.FORMAT_ERROR_STACK );
END;
/
Error(17,14): PLS-00904: insufficient privilege to access object TAX

Now the CZINK user wants to create and execute a depts2 procedure referring to the tax function owned by HR:

(It is supposed that the CZINK user has its own employees table.)

CREATE OR REPLACE
PROCEDURE depts2(p_deptno  employees.department_id%TYPE)
AUTHID CURRENT_USER
IS
CURSOR c_emp(c_deptno employees.department_id%TYPE) IS
SELECT e.*, 12*salary*(1+NVL(commission_pct,0)) ANN_SAL
FROM employees e WHERE department_id=c_deptno;
manager employees.last_name%TYPE;
BEGIN
FOR r IN c_emp(p_deptno) LOOP
IF r.manager_id IS NOT NULL THEN
SELECT last_name INTO MANAGER FROM employees
WHERE employee_id=r.manager_id;
ELSE
manager:='No Manager';
END IF;
DBMS_OUTPUT.PUT_LINE(c_emp%ROWCOUNT||'. name:='||r.last_name||' salary:'||r.salary||
' tax:'|| HR.tax(r.salary) ||  ' Manager:'||manager||' '||r.manager_id||' Annual Salary:'||r.ann_sal);
END LOOP;
EXCEPTION
WHEN OTHERS THEN
dbms_output.put_line('The error:'||DBMS_UTILITY.FORMAT_ERROR_STACK );
END;
/
exec depts2(90)
anonymous block completed
1. name:=King salary:24000 tax:7200 Manager:No Manager  Annual Salary:288000
2. name:=Kochhar salary:17000 tax:4250 Manager:King 100 Annual Salary:204000
3. name:=De Haan salary:17000 tax:4250 Manager:King 100 Annual Salary:204000

Of course, if the CZINK user created a procedure (referring to HR’s tax function) with a name other than depts2, then CZINK would get the same error message. (The DBA role was assigned to the CZINK user in my example.)

Comments on the benefits of using the ACCESSIBLE BY clause:

  1. You can provide access to a helper PL/SQL program only to those PL/SQL subprograms which really need to refer to it.
  2. The restriction is made for PL/SQL subprograms, not for users.
  3. Even if a user has the DBA role or “just” the EXECUTE ANY PROCEDURE privilege, the user won’t be able to use the helper PL/SQL subprogram unless the helper program grants access directly to the invoking program.
  4. You can specify the ACCESSIBLE BY clause at the package level (not for individual members), like this:

CREATE OR REPLACE PACKAGE taxes ACCESSIBLE BY (depts, czink.depts2)
IS
  FUNCTION tax1(base NUMBER) RETURN NUMBER;
  FUNCTION tax2(base NUMBER) RETURN NUMBER;
END taxes;
/
CREATE OR REPLACE PACKAGE BODY taxes
IS
  FUNCTION tax1(base NUMBER) RETURN NUMBER
  IS
    s NUMBER;
  BEGIN
    IF    base < 4000  THEN s := 0.10;
    ELSIF base < 20000 THEN s := 0.25;
    ELSE                    s := 0.30;
    END IF;
    RETURN base*s;
  END tax1;

  FUNCTION tax2(base NUMBER) RETURN NUMBER
  IS
    s NUMBER;
  BEGIN
    IF base < 4000 THEN s := 0.10;
    ELSE                s := 0.30;
    END IF;
    RETURN base*s;
  END tax2;
END taxes;
/
/
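To see the restriction in action, a unit that is not on the accessor list should fail at compile time when it references the package. The procedure name depts3 below is hypothetical; PLS-00904 is the error I would expect the ACCESSIBLE BY check to raise for a disallowed reference:

```sql
-- Hypothetical caller that is NOT in the accessor list of taxes:
CREATE OR REPLACE PROCEDURE depts3
IS
BEGIN
  DBMS_OUTPUT.PUT_LINE(taxes.tax1(10000));
END depts3;
/
-- Compilation fails with something like:
-- PLS-00904: insufficient privilege to access object TAXES
```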

New AWR Report Format: Oracle 11.2.0.4 and 12c

October 12, 2014


This is a post about Oracle Automatic Workload Repository (AWR) Reports. If you are an Oracle professional you doubtless know what these are – and if you have to perform any sort of performance tuning as part of your day job it’s likely you spend a lot of time immersed in them. For anyone else, they are (huge) reports containing all sorts of information about activities that happened between two points of time on an Oracle instance. If that doesn’t excite you now, please move along – there is nothing further for you here.

AWR Reports have been with us since the introduction of the Automatic Workload Repository back in 10g and can be considered a replacement for the venerable Statspack tool. Through each major incremental release the amount of information contained in an AWR Report has grown; for instance, the 10g reports didn’t even show the type of operating system, but 11g reports do. More information is of course a good thing, but sometimes it feels like there is so much data now it’s hard to find the truth hidden among all the distractions.

The 11.2.0.4 New Features document doesn’t mention anything about a new report format. I can’t find anything about it on My Oracle Support (but then I can never find anything about anything I’m looking for on MOS these days). So I’m taking it upon myself to document the new format and the changes introduced – as well as point out a nasty little issue that’s caught me out a couple of times already.

Comparing Old and New Formats

From what I can tell, all of the major changes except one have taken place in the Report Summary section at the start of the AWR report. Oracle appears to have re-ordered the subsections and added a couple of new ones:

  • Wait Classes by Total Wait Time
  • IO Profile

The new Wait Classes section is interesting because there is already a section called Foreground Wait Class down in the Wait Event Statistics section of the Main Report, but the additional section appears to include background waits as well. The IO Profile section is especially useful for people like me who work with storage.

In addition, the long-serving Top 5 Timed Foreground Events section has been renamed and extended to become Top 10 Foreground Events by Total Wait Time.

Here are the changes in tabular format:

Old Format                         New Format
---------------------------------  -------------------------------------------
Cache Sizes                        Load Profile
Load Profile                       Instance Efficiency Percentages
Instance Efficiency Percentages    Top 10 Foreground Events by Total Wait Time
Shared Pool Statistics             Wait Classes by Total Wait Time
Top 5 Timed Foreground Events      Host CPU
Host CPU                           Instance CPU
Instance CPU                       IO Profile
Memory Statistics                  Memory Statistics
-                                  Cache Sizes
-                                  Shared Pool Statistics
Time Model Statistics              Time Model Statistics

I also said there was one further change outside of the Report Summary section. It’s the long-standing Instance Activity Stats section, which has now been divided into two:

Old Format                New Format
------------------------  -----------------------------
Instance Activity Stats   Key Instance Activity Stats
-                         Other Instance Activity Stats

I don’t really understand the point of that change, nor why a select few statistics are deemed to be more “key” than others. But hey, that’s the mystery of Oracle, right?

Tablespace / Filesystem IO Stats

Another, more minor change, is the addition of some cryptic-looking “1-bk” columns to the two sections Tablespace IO Stats and File IO Stats:

Tablespace
------------------------------
          Av       Av     Av      1-bk  Av 1-bk          Writes   Buffer  Av Buf
  Reads   Rds/s  Rd(ms) Blks/Rd   Rds/s  Rd(ms)  Writes   avg/s    Waits  Wt(ms)
------- ------- ------- ------- ------- ------- ------- ------- -------- -------
UNDOTBS1
8.4E+05      29     0.7     1.0 6.3E+06    29.2       1     220    1,054     4.2
SYSAUX
 95,054       3     0.8     1.0  11,893     3.3       1       0        1    60.0
SYSTEM
    745       0     0.0     1.0   1,055     0.0       0       0       13     0.8
USERS
    715       0     0.0     1.0     715     0.0       0       0        0     0.0
TEMP
      0       0     0.0     N/A       7     0.0       0       0        0     0.0

I have to confess it took me a while to figure out what they meant – in the end I had to consult the documentation for the view DBA_HIST_FILESTATXS:

Column          Datatype  NULL  Description
--------------  --------  ----  --------------------------------------------------
SINGLEBLKRDS    NUMBER          Number of single block reads
SINGLEBLKRDTIM  NUMBER          Cumulative single block read time (in hundredths of a second)

Aha! So the AWR report is now giving us the number of single block reads (SINGLEBLKRDS) and the average read time for them (SINGLEBLKRDTIM / SINGLEBLKRDS). That’s actually pretty useful information for testing storage, since single block reads tell no lies. [If you want to know what I mean by that, visit Frits Hoogland's blog and download his white paper on multiblock reads...]
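So the per-tablespace figures can be reproduced from those repository columns. A minimal Python sketch of that arithmetic; the input numbers are made-up sample values, not real snapshot data:

```python
# Sketch: derive the "Av 1-bk Rd(ms)" figure the AWR report now prints
# from the DBA_HIST_FILESTATXS columns described above.

def avg_single_block_read_ms(singleblkrds: int, singleblkrdtim: int) -> float:
    """SINGLEBLKRDTIM is in hundredths of a second (centiseconds),
    so multiply by 10 to convert to milliseconds before dividing."""
    if singleblkrds == 0:
        return 0.0
    return (singleblkrdtim * 10.0) / singleblkrds

# e.g. 6,300,000 single block reads taking 1,840,000 cs in total
# gives an average of roughly 2.9 ms per read:
print(round(avg_single_block_read_ms(6_300_000, 1_840_000), 2))
```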

Top 10: Don’t Believe The Stats

One thing you might want to be wary about is the new Top 10 section… Here are the first two lines from mine after running a SLOB test:

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                            Tota    Wait   % DB
Event                                 Waits Time Avg(ms)   time Wait Class
------------------------------ ------------ ---- ------- ------ ----------
db file sequential read        3.379077E+09 2527       1   91.4 User I/O
DB CPU                                      318.           11.5

Now, normally when I run SLOB and inspect the post-run awr.txt file I work out the average wait time for db file sequential read so I can work out the latency. Since AWR reports do not have enough decimal places for the sort of storage I use (the wait shows simply as 0 or 1), I have to divide the total wait time by the number of waits. But in the report above, the total wait time of 2,527 seconds divided by 3,379,077,000 waits gives me an average of 0.000747 milliseconds, nowhere near the 1 ms shown in the Avg(ms) column. Huh? Looking back at the numbers above it’s clear that the Total Time column has been truncated and some of the digits are missing. That’s bad news for me, as I regularly use scripts to strip this information out and parse it.
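The arithmetic makes the truncation obvious. A quick Python check, comparing the truncated total against the full 2,527,552-second figure the report gives elsewhere:

```python
# Sanity check on the truncated Top 10 numbers: 2,527 seconds over
# ~3.38 billion waits implies sub-microsecond "latency", while the
# full total of 2,527,552 seconds matches the ~1 ms Avg(ms) column.
waits = 3_379_077_000

truncated_avg_ms = 2_527 / waits * 1_000      # absurdly small
true_avg_ms = 2_527_552 / waits * 1_000       # consistent with Avg(ms) of 1

print(f"truncated: {truncated_avg_ms:.6f} ms, true: {true_avg_ms:.3f} ms")
```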

This is pretty poor in my opinion, because there is no warning and the number is just wrong. I assume this is an edge case because the number of waits contains so many digits, but for extended SLOB tests that’s not unlikely. Back in the good old Top 5 days it looked like this, which worked fine:

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                              Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
db file sequential read         119,835,425      50,938      0   84.0 User I/O
latch: cache buffers lru chain   20,051,266       6,221      0   10.3 Other

Unfortunately, in the new 11.2.0.4 and above Top 10 report, the Total Time column simply isn’t wide enough. Instead, I have to scan down to the Foreground Wait Events section to get my true data:

                                                             Avg
                                        %Time Total Wait    wait    Waits   % DB
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file sequential read    3.379077E+09     0  2,527,552       1     11.3   91.4

This is something worth looking out for, especially if you also use scripts to fetch data from AWR files. Of course, the HTML reports don’t suffer from this problem, which just makes it even more annoying as I can’t parse HTML reports automatically (and thus I despise them immensely).
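For what it’s worth, scraping the Foreground Wait Events section instead is straightforward. A sketch of that kind of script in Python; the line layout is assumed from the excerpt above, so a real awr.txt may need a more forgiving pattern:

```python
import re

# Pull the event name, wait count and total wait time from a
# Foreground Wait Events line and compute the real average latency.
line = ("db file sequential read    3.379077E+09     0  2,527,552"
        "       1     11.3   91.4")

m = re.match(r"(\D+?)\s+([\d.E+]+)\s+\d+\s+([\d,]+)", line)
event = m.group(1).strip()
waits = float(m.group(2))                          # scientific notation OK
total_wait_s = float(m.group(3).replace(",", ""))  # strip thousands commas

avg_ms = total_wait_s / waits * 1_000
print(f"{event}: {avg_ms:.3f} ms average")
```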

12.1.0.2 AWR Reports

One final thing to mention is the AWR report format of 12.1.0.2 (which was just released at the time of writing). There aren’t many changes from 12.1.0.1, but a few extra lines have crept in, which I’ll highlight here. In the main, they are related to the new In-Memory Database option.

Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):               0.3               4.8      0.00      0.02
              DB CPU(s):               0.1               1.2      0.00      0.00
      Background CPU(s):               0.0               0.1      0.00      0.00
      Redo size (bytes):          50,171.6         971,322.7
  Logical read (blocks):             558.6          10,814.3
          Block changes:             152.2           2,947.0
 Physical read (blocks):              15.1             292.0
Physical write (blocks):               0.2               4.7
       Read IO requests:              15.1             292.0
      Write IO requests:               0.2               3.3
           Read IO (MB):               0.1               2.3
          Write IO (MB):               0.0               0.0
           IM scan rows:               0.0               0.0
Session Logical Read IM:
             User calls:              16.1             312.0
           Parses (SQL):              34.0             658.0
      Hard parses (SQL):               4.6              88.0
     SQL Work Area (MB):               0.9              17.2
                 Logons:               0.1               1.7
         Executes (SQL):              95.4           1,846.0
              Rollbacks:               0.0               0.0
           Transactions:               0.1

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   97.55       Redo NoWait %:  100.00
            Buffer  Hit   %:   97.30    In-memory Sort %:  100.00
            Library Hit   %:   81.75        Soft Parse %:   86.63
         Execute to Parse %:   64.36         Latch Hit %:   96.54
Parse CPU to Parse Elapsd %:   19.45     % Non-Parse CPU:   31.02
          Flash Cache Hit %:    0.00

<snip!>

Cache Sizes                       Begin        End
~~~~~~~~~~~                  ---------- ----------
               Buffer Cache:       960M       960M  Std Block Size:         8K
           Shared Pool Size:     4,096M     4,096M      Log Buffer:   139,980K
             In-Memory Area:         0M         0M

One other thing of note is that the Top 10 section now (finally) displays average wait times to two decimal places. This took a surprising amount of time to arrive, but it’s most welcome:

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Wait   % DB Wait
Event                                Waits Time (sec)    Avg(ms)   time Class
------------------------------ ----------- ---------- ---------- ------ --------
db file parallel read               63,157      828.6      13.12   86.1 User I/O
DB CPU                                          234.2              24.3
db file sequential read            113,786       67.8       0.60    7.0 User I/O
Categories: 12c, Oracle