Producing RSS from PL/SQL

First things first. The idea and the bulk of the code for this post are not mine, they are Sean Dillon’s. It’s a very cool idea that he came up with and it still works flawlessly

I needed to incorporate RSS into a project I’m working, so I grabbed Sean’s code. The problem is, it’s based on the AskTom table structure which means it won’t run on your database without immediately re-writing the query. Additionally, the nature of this code relies on a pretty lengthy query to generate the XML. I’ll admit, when I first looked at it, I thought “Wow, this is going to be more complex than I thought.” After looking at it for a little while longer, I realized it was actually very simple. Sean also included support for several versions of RSS, improving the functionality, but again, adding to the complexity.

So, I created an example table and simplified the code as much as possible to make it easier for everyone to understand. The table, “PLSQL_PACKAGES”, stores information about some of the built-in PL/SQL packages I use on a regular basis. The links in this table point back to the online Oracle Documentation.

This block of code is just the DDL for the sample table and the insert statements to populate it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
create table plsql_packages(
    id          varchar2(32),
    title       varchar2(255),
    description varchar2(4000),
    link        varchar2(1000),
    updated_by  varchar2(100),
    updated_on  date)
/
create or replace trigger  biu_plsql_packages before insert or
update on plsql_packages
for each row
begin
    if inserting then
        :new.id := sys_guid();
    end if;
        :new.updated_by := nvl(v('APP_USER'),user);
        :new.updated_on := sysdate;
end;
/
insert into plsql_packages(title,description,link)
     values ('DBMS_CRYPTO','DBMS_CRYPTO provides an interface
to encrypt and decrypt stored data, and can be used in
conjunction with PL/SQL programs running network
communications. It provides support for several
industry-standard encryption and hashing algorithms,
including the Advanced Encryption Standard (AES)
encryption algorithm. AES has been approved by the National
Institute of Standards and Technology (NIST) to replace
/
insert into plsql_packages(title,description,link)
     values ('DBMS_EPG','The DBMS_EPG package implements
the embedded PL/SQL gateway that enables a web browser to
invoke a PL/SQL stored procedure through an HTTP listener.','http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_epg.htm#sthref3481')
/
insert into plsql_packages(title,description,link)
     values (' OWA_UTIL','The OWA_UTIL package contains
utility subprograms for performing operations such as
getting the value of CGI environment variables, printing
the data that is returned to the client, and printing the
/
insert into plsql_packages(title,description,link)
     values ('UTL_MAIL','The UTL_MAIL package is a utility
for managing email which includes commonly used email
features, such as attachments, CC, BCC, and return receipt.','http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/u_mail.htm#i1001258')
/

This is the important block of code as it creates the RSS procedure. Don’t be intimidated by it though as you only need to modify a few lines. The only lines you need to customize to make it work against your own table are 4-7 and 41-43!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
create or replace procedure rss
is
    -- customizable parameters
    l_title         varchar2(255) := 'Oracle PL/SQL Packages';
    l_link          varchar2(255) := 'http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/toc.htm';
    l_description   varchar2(255) := 'This is a feed of changes to PL/SQL Package Documentation';
    l_language      varchar2(255) := 'en-us';
    -- end customizable parameters
    l_version       varchar2(10)  := '2.0';
    l_clob          clob;
    l_idx           pls_integer := 1;
    l_len           pls_integer := 255;
    l_defrows       pls_integer := 10;
    l_maxrows       pls_integer := 30;
    l_desclen       pls_integer := 250;
begin
    for i in (
      select xmlelement( "rss",
               -- Begin XML Header Block
               xmlattributes( l_version as "version"),
                 xmlelement( "channel",
                   xmlforest( l_title as "title",
                              l_link as "link",
                              l_description as "description",
                              l_language as "language"),
                 -- End XML Header Block
                 -- Begin List of Individual Articles or
                 -- Items
                 xmlagg(
                     xmlelement( "item",
                       xmlelement("title", x.title),
                       xmlelement("link", x.link),
                       xmlelement("description",
                       x.description),
                       xmlelement("pubDate",
                       to_char(x.updated_on,'Dy, DD Mon RRRR hh24:mi:ss')),
                       xmlelement("guid", XMLATTRIBUTES
                       ('false' as "isPermaLink"),x.id||
                       to_char(x.updated_on,'JHH24MISS'))
                     )
                   )
                   -- End List of Individual Articles or
Items
                 )
             ) as result
        from -- Actual Database Query that
--populates the list of Items
                select id,title,link,description,
                updated_on
                  from plsql_packages
                 where rownum < (l_maxrows+1)) x)
    loop
        l_clob := xmltype.extract(i.result,'/').getclobval;
        exit;
    end loop; --i
    --- OUTPUT RESULTS
    owa_util.mime_header('application/xml', false);
    owa_util.http_header_close;
    for i in 1..ceil(dbms_lob.getlength(l_clob)/l_len) loop
        htp.prn(substr(l_clob,l_idx,l_len));
        l_idx := l_idx + l_len;
    end loop; --i
end rss;
/

Note on line 34 the “guid” element. This is an optional element (documented here) that an aggregator can use to uniquely identify the item. I’m concatenating the ID column from the table with Julian date concatenated with hours, minutes, and seconds – to_char(sysdate,‘JHH24MISS’). This means that when you update a row, the date will change causing your aggregator to see a new guid and display a new item for the changed row.The easiest way to test this procedure is using the “OWA Output” tab in SQL Developer:SQL Developer OWA Ouput

If you’re running XE or 11g and you want to call this procedure directly through the APEX DAD, you’ll need to edit the FLOWS_XXXXXX.wwv_flow_epg_include_mod_local function and comment-out the first line as well as your procedure to the IN list.

 

 

Advertisements

A Beginner’s Guide to NoSQL

What is it going to be? MySQL, MS-SQL, Oracle or PostgreSQL? After all, nothing can be as amazing as a good old RDBMS that employs SQL to manage the data.

Well, allow me to introduce to you an entirely unique and unconventional Database model – NoSQL. Just like every other fine article out there, we too shall begin ith…eh….disclaimers!

NoSQL stands for not-only-SQL. The idea here is not to oppose SQL, but instead provide an alternative in terms of storage of data. Yet, for the obvious reason that most users are well versed with SQL, many NoSQL databases strive to provide an SQLlike query interface.

Why NoSQL?  

That’s a valid question, indeed. Well, here are the reasons:

  • Managing Large Chunks of Data: NoSQL databases can easily handle numerous read/write cycles, several users and amounts of data ranging in petabytes.
  • Schema? Nah, not needed: Most NoSQL databases are devoid of schema and  therefore very flexible. They provide great choices when it comes to constructing a schema and foster easy mapping of objects into them. Terms such as normalization and complex joins are, well, not needed!
  • Programmer-friendly: NoSQL databases provide simple APIs in every major programming language and therefore there    is no need for complex ORM frameworks. And just in case APIs are not available for a particular programming language, data can still be accessed over HTTP via a simple RESTful API, using XML and/or JSON.
  • Availability: Most distributed NoSQL databases provide easy replication of data and failure of one node does not affect the availability of data in a major way.
  • Scalability: NoSQL databases do not require a dedicated high performance server. Actually, they can easily be run on a cluster of commodity hardware and scaling out is just as simple as adding a new node.
  • Low Latency: Unless you are running a cluster of a trillion data servers (or something like that, give or take a few million  of           them),  NoSQL can help you achieve extremely low latency. Of course, latency in itself depends on the amount of data that can be successfully loaded into memory.

Triple stores save data in the form of subject-predicate-object with the predicate being the linking factor between subject and object. As such, Triple Scores too are variants of network databases. For instance, let’s say “Jonny Nitro reads Data Center Magazine.” In this case, Jonny Nitro is the subject, while Data Center Magazine is the object, and the term ‘reads’ acts as the predicate linking the subject with the object. Quite obviously, mapping such semantic queries into SQL will prove difficult, and therefore NoSQL offers a viable alternative. Some of the major implementations of Triple Stores are Sesame, Jena, Virtuoso, AllegroGraph, etc.

SQL ideology 

Basically, NoSQL drops the traditional SQL ideology in favor of CAP Theorem or Brewer’s Theorem, formulated by Eric Brewer in 2000. the theorem talks about three basic principles of Consistency, Availability and Partition Tolerance (abbreviated as CAP), adding that a distributed database can at the most satisfy only two of these. NoSQL databases implement the theorem by employing Eventual Consistency, which is a morerelaxed form of consistency that performs the task over a sufficient period of time. This in turn improves availability and scalability to a great extent.This paradigm is often termed as BASE  – implying Basically Available, Soft state, Eventual Consistency.

NoSQL Data Models

Some of the major and most prominent differentiations among NoSQL databases are as follows:

  1. Document Stores
  2. Hierarchical
  3. Network
  4. Column-oriented
  5. Object-oriented
  6. Key-value Stores
  7. Triple Stores

Document stores

Gone are the days when data organization used to be as minimal as simple rows and columns. Today, data is more often than not represented in the form of XML or JSON (we’re talking about the Web, basically). The reason for favoring XML or JSON     is because both of them are extremelyportable, compact and standardized. Bluntly put, it makes little sense to map XML or JSON documents into a relational model. Instead, a wiser decision would be to utilize the document stores already available. Why? Again, simply because NoSQL databases are schema-less, and there existsno predefined for an XML or JSON document and as a result, each document is independent of the other. The database can be employed in CRM, web-related data, real-time data, etc. Some of the most well known implementation models are MongoDB, CouchDB and RavenDB. In fact, MongoDB has been used by websites such as bit.ly and Sourceforge.

Hierarchical Databases 

These databases store data in the form of hierarchical relevance, that is, tree or parent-child relationship. In terms of relational models, this can be termed as 1:N relationship. Basically, geospatial databases can be used in a hierarchical form to store location information which is essentially hierarchical, though algorithms may vary. Geotagging and geolocation are in vogue of late. It is in such uses that a geospatial database becomes very relevant, and can be used  in Geographical Information System.     Major   examples of the same include PostGIS, Oracle Spatial, etc. Also, some of the most well known implementations of hierarchical databases are the Windows Registry by Microsoft and the IMS Database by IBM.

Graph Network Databases

Graph databases are the most popular form of network database that are used to store data that can be represented in the form of a Graph. Basically, data stored by graph databases can grow exponentially and thus, graph databases are  ideal for storing data that changes frequently. Cutting the theoretical part, graph database has perhaps the most awesome example in the likes of FlockDB, developed by Twitter to implement a graph of who follows whom. FlockDB uses the Gizzard Framework to query a database up to 10,000 times per second. A general technique to query a graph is to begin from an arbitrary or specified start node and follow it by traversing the graph in a depth-first or breadth-first fashion, as per the relationships that obey the given criterion. Most graph databases allow the developer to use simple APIs for accomplishing the task. For instance, you can make queries such as: “Does Jonny Nitro read Data Center Magazine?” Some of the most popular graph databases include, apart from FlockDB, HyperGraphDB and Neo4j.

Column-oriented Databases

Column-oriented databases came into existence after Google’s research paper on its BigTable distributed storage system, which is used internally along with the Google file system. Some of the popular implementations are Hadoop Hbase, Apache Cassandra, HyperTable, etc.

Such databases are implemented more like three-dimensional arrays, the first dimension being the row identifier, the second being a combination of column family plus column identifier and the third being the timestamp. Column-oriented databases are employed by Facebook, Reddit, Digg, etc.

Object-oriented Databases

Whether or not object-oriented databases are purely NoSQL databases is debatable, yet they are more often than not considered to be so because such databases too depart from traditional RDBMS based data models. Such databases allow the storage of data in the form of objects, thereby making it highly transparent. Some of the most popular ones include db4o, NEO, Versant, etc. Object-oriented databases are generally used in research purposes or web-scale production.

Key-value stores

Key-value stores are (arguably) based on Amazon’s Dynamo Research Paper and Distributed hash Tables. Such data models are extremely simplifiedand generally contain only one set of global key value pairs with each value having a unique key associated to it. The database, therefore, is highly scalable and  does not store data relationally. Some popular implementations include Project Voldemort (open-sourced by LinkedIn), Redis, Tokyo Cabinet, etc.

Triple stores

Triple stores save data in the form of subject-predicate-object with the predicate being the linking factor between subject and object. As such, Triple Scores too are variants of network databases. For instance, let’s say “Jonny Nitro reads Data Center Magazine.” In this case, Jonny Nitro is the subject, while Data Center Magazine is the object, and the term ‘reads’ acts as the predicate linking the subject with the object. Quite obviously, mapping such semantic queries into SQL will prove difficult, and therefore NoSQL offers a viable alternative. Some of the major implementations of Triple Stores are Sesame, Jena, Virtuoso, AllegroGraph, etc.

Summary

So, what now? Well, you’ve just been introduced to NoSQL. However, does this mean that you should make the switch to it from SQL? Perhaps. Or perhaps not. The answer varies from situation to situation. If you find SQL queries way too much to cope with, chances are you’ll find NoSQL equally difficult. However, if you’re looking for a more flexible      alternative and do not mind getting your hands dirty, you should definitely give NoSQL a spin! The choice, obviously, is yours! Happy data managing to you!

What NoSQL databases are present today?

  • Cassandra
    • Data Model: Columnfamily
    • Query API: Thrift
  • CouchDB
    • Data Model: Document
    • Query API: map/reduce views
  • HBase
    • Data Model: Columnfamily
    • Query API: Thrift, REST
  • MongoDB
    • Data Model: Document
    • Query API: Cursor
  • Neo4j
    • Data Model: Graph
    • Query API: Graph
  • Redis
    • Data Model: Collection
    • Query API: Collection
  • Riak
    • Data Model: Document
    • Query API: Nested hashes
  • Scalaris
    • Data Model: Key/value
    • Query API: get/put
  • Tokyo Cabinet
    • Data Model: Key/value
    • Query API: get/put
  • Voldemort
    • Data Model: Key/value
    • Query API: get/put