How Oracle Locking Works

When a transaction updates a row, it puts a lock so that no one can update the same row until it commits. When another transaction issues an update to the same row, it waits until the first one either commits or rolls back. After the first transaction performs a commit or rollback, the update by the second transaction is executed immediately, since the lock placed by the first transaction is now gone. How exactly does this locking mechanism work? Several questions come to mind in this context:

  1. Is there some kind of logical or physical structure called lock?
  2. How does the second transaction know when the first transaction has lifted the lock?
  3. Is there some kind of “pool” of such locks where transactions line up to get one?
  4. If so, do they line up to return it when they are done with the locking?
  5. Is there a maximum number of possible locks?
  6. Is there something called a block level lock? Since Oracle stores the rows in blocks, when all or the majority of rows in the blocks are locked by a single transaction, doesn’t it make sense for to lock the entire block to conserve the number of locks?
  7. The previous question brings up another question – does the number of active locks in the database at any point really matter?

If you are interested to learn about all this, please read on.

Lock Manager

Since locks convey information on who has what rows modified but not committed, anyone interested in making the update much check with some sort of system that is available across the entire database. So, it makes perfect sense to have a central locking system in the database, doesn’t it? But, when you think about it, a central lock manager can quickly become a single point of contention in a busy system where a lot of updates occur. Also, when a large number of rows are updated in a single transaction, an equally large number of locks will be required as well. The question is: how many? One can guess; but it will be at best a wild one. What if you guessed on the low side and the supply of available locks is depleted? In that case some transactions can’t get locks and therefore will have to wait (or, worse, abort). Not a pleasant thought in a system that needs to be scalable. To counter such a travesty you may want to make the available supply of locks really high. What is the downside of that action? Since each lock would potentially consume some memory, and memory is finite, it would not be advisable to create an infinite supply of locks.

Some databases actually have a lock manager with a finite supply of such locks. Each transaction must ask to get a lock from it before beginning and relinquish locks to it at the completion. In those technologies, the scalability of application suffers immensely as a result of the lock manager being the point of contention. In addition, since the supply of locks is limited, the developers need to commit frequently to release the locks for other transactions. When a large number of rows have locks on them, the database replaces the row locks with a block level lock to cover all the rows in the block – a concept known as lock escalation. Oracle does not follow that approach. In Oracle, there no central lock manager, no finite limit on locks and there is no such concept called lock escalation. The developers commit only when there is a logical need to do so; not otherwise.

Lock Management in Oracle

So, how is that approach different in case of Oracle? For starters, there is no central lock manager. But the information on locking has to be recorded somewhere. Where then? Well, consider this: when a row is locked, it must be available to the session, which means the session’s server process must have already accessed and placed the block in the buffer cache prior to the transaction occurring. Therefore, what is a better place for putting this information than right there in the block (actually the buffer in the buffer cache) itself?

Oracle does precisely that – it records the information in the block. When a row is locked by a transaction, that nugget of information is placed in the header of the block where the row is located. When another transaction wishes to acquire the lock on the same row, it has to access the block containing the row anyway (as you learned in Part 1 of this series) and upon reaching the block, it can easily confirm that the row is locked from the block header. A transaction looking to update a row in a different block puts that information on the header of that block. There is no need to queue behind some single central resource like a lock manager. Since lock information is spread over multiple blocks instead of a single place, this mechanism makes transactions immensely scalable.

Being the smart reader you are, you are now hopefully excited to learn more or perhaps you are skeptical. You want to know the nuts and bolts of this whole mechanism and, more, you want proof. We will see all that in a moment.

Transaction Address

Before understanding the locks, you should understand clearly what a transaction is and how it is addressed. A transaction starts when an update to data such as insert, update or delete occurs (or the intention to do so, e.g. SELECT FOR UPDATE) and ends when the session issues a commit or rollback. Like everything else, a specific transaction should have a name or an identifier to differentiate it from another one of the same type. Each transaction is given a transaction ID. When a transaction updates a row (it could also insert a new row or delete an existing one; but we will cover that little later in this article), it records two things:

  • The new value
  • The old value

The old value is recorded in the undo segments while the new value is immediately updated in the buffer where the row is stored. The data buffer containing the row is updated regardless of whether the transaction is committed or not. Yes, let me repeat – the data buffer is updated as soon as the transaction modifies the row (before commit).

Undo information is recorded in a circular fashion. When new undo is created, it is stored in the next available undo “slot”. Each transaction occupies a record in the slot. After all the slots are exhausted and a new transaction arrives, the next processing depends on the state of the transactions. If the oldest transaction occupying any of the other slots is no longer active (that is either committed or rolled back), Oracle will reuse that slot. If none of the transactions is inactive, Oracle will have to expand the undo tablespace to make room. In the former case (where a transaction is no longer active and its information in undo has been erased by a new transaction), if a long running query that started before the transaction occurred selects the value, it will get an ORA-1555 error. But that will be covered in a different article in the future. If the tablespace containing the undo segment can’t extend due to some reason (such as in case of the filesystem being completely full), the transaction will fail.

Speaking of transaction identifiers, it is in the form of three numbers separated by periods. These three numbers are:

  • Undo Segment Number where the transaction records its undo entry
  • Slot# in the undo segment
  • Sequence# (or wrap) in the undo slot

This is sort of like the social security number of the transaction. This information is recorded in the block header. Let’s see the proof now through a demo.

Demo

First, create a table:

SQL> create table itltest (col1 number, col2 char(8));

Insert some rows into the table.

SQL> begin
  2     for i in 1..10000 loop 
  3             insert into itltest values (i,'x');
  4     end loop;
  5     commit;
  6  end;
  7  /

Remember, this is a single transaction. It started at the “BEGIN” line and ended at “COMMIT”. The 10,000 rows were all inserted as a part of the same transaction. To know the transaction ID of this transaction, Oracle provides a special package – dbms_transaction. Here is how you use it. Remember, you must use it in the same transaction. Let’s see:

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
------------------------------------------------------------------------


1 row selected.

Wait? There is nothing. The transaction ID returned is null. How come?

If you followed the previous section closely, you will realize that the transaction ends when a commit or rollback is issued. The commit was issued inside the PL/SQL block. So, the transaction had ended before you called the dbms_transaction is package. Since there was no transaction, the package returned null.

Let’s see another demo. Update one row:

SQL> update itltest set col2 = 'y' where col1 = 1;

1 row updated.

In the same session, check the transaction ID:

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
-------------------------------------------------------------------------
3.23.40484

1 row selected.

There you see – the transaction ID. The three numbers separated by period signify undo segment number, slot# and record# respectively. Now perform a commit:

SQL> commit;

Commit complete.

Check the transaction ID again:

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
-------------------------------------------------------------------------


1 row selected.

The transaction is gone so the ID is null, as expected.

Since the call to the package must be in the same transaction (and therefore in the same session), how can you check the transaction in a different session? In real life you will be asked to check transaction in other sessions, typically application sessions. Let’s do a slightly different test. Update the row one more time and check the transaction:

SQL> update itltest set col2 = 'y' where col1 = 1;

1 row updated.

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
-----------------------------------------------------------------------
10.25.31749

1 row selected.

From a different session, check for active transactions. This information is available in the view V$TRANSACTION. There are several columns; but we will look at four of the most important ones:

  • ADDR – the address of the transaction, which is a raw value
  • XIDUSN – the undo segment number
  • XIDSLOT – the slot#
  • XIDSQN – the sequence# or record# inside the slot
SQL> select addr, xidusn, xidslot, xidsqn
  2  from v$transaction;

ADDR         XIDUSN    XIDSLOT     XIDSQN
-------- ---------- ---------- ----------
3F063C48         10         25      31749

Voila! You see the transaction id of the active transaction from a different session. Compare the above output to the one you got from the call to dbms_transaction package. You can see that the transaction identifier shows the same set of numbers.

Interested Transaction List

You must be eager to know about the section of the block header that contains information on locking and how it records it. It is a simple data structure called “Interested Transaction List” (ITL), a list that maintains information on transaction. The ITL contains several placeholders (or slots) for transactions. When a row in the block is locked for the first time, the transaction places a lock in one of the slots. In other words, the transaction makes it known that it is interested in some rows (hence the term “Interested Transaction List”). When a different transaction locks another set of rows in the same block, that information is stored in another slot and so on. When a transaction ends after a commit or a rollback, the locks are released and the slot which was used to mark the row locks in the block is now considered free (although it is not updated immediately – fact about which you will learn later in a different installment).

ITLs in Action

Let’s see how ITLs really work. Here is an empty block. The block header is the only occupant of the block.

This is how the block looks like after a single row has been inserted:

Note, the row was inserted from the bottom of the block. Now, a second row has been inserted:

A session comes in and updates the row Record1, i.e. it places a lock on the row, shown by the star symbol. The lock information is recorded in the ITL slot in the block header:

The session does not commit yet; so the lock is active. Now a second session – Session 2 – comes in and updates row Record2. It puts a lock on the record – as stored in the ITL slot.

I have used two different colors to show the locks (as shown by the star symbol) and the color of the ITL entry.

As you can clearly see, when a transaction wants to update a specific row, it doesn’t have to go anywhere but the block header itself to know if the row is locked or not. All it has to do is to check the ITL slots. However ITL alone does not show with 100% accuracy that row is locked (again, something I will explain in a different installment). The transaction must go to the undo segment to check if the transaction has been committed. How does it know which specifci part of the undo segment to go to? Well, it has the information in the ITL entry. If the row is indeed locked, the transaction must wait and retry. As soon as the previous transaction ends, the undo information is updated and the waiting transaction completes its operation.

So, there is in fact a queue for the locks, but it’s at the block level, not at the level of the entire database or even the segment.

Demo

The proof is in the pudding. Let’s see all this through a demo. Now that you know the transaction entry, let’s see how it is stored in the block header. To do that, first, we need to know which blocks to look for. So, we should get the blocks numbers where the table is stored:

SQL> select file_id, relative_fno, extent_id, block_id, blocks
  2  from dba_extents
  3  where segment_name = 'ITLTEST';

   FILE_ID RELATIVE_FNO  EXTENT_ID   BLOCK_ID     BLOCKS
---------- ------------ ---------- ---------- ----------
         7            7          0       3576          8
         7            7          1       3968          8
         7            7          2       3976          8
         7            7          3       3984          8

To check inside the block, we need to “dump” the contents of the block to a tracefile so that we can read it. From a different session issue a checkpoint so that the buffer data is now written to the dis:

SQL> alter system checkpoint;

Now dump the data blocks 3576 through 3583.

SQL> alter system dump datafile 7 block min 3576 block max 3583;

System altered.

This will create a tracefile in the user dump destination directory. In case of Oracle 11g, the tracefile will be in the diag structure under /diag/rdbms///trace directory. It will be most likely the last tracefile generated. You can also get the precise name by getting the OS process ID of the session:

SQL> select p.spid
  2  from v$session s, v$process p
  3  where s.sid = (select sid from v$mystat where rownum < 2)
  4  and p.addr = s.paddr
  5  /

SPID
------------------------
9202

1 row selected.

Now look for a file named _ora_9202.trc. Open the file in vi and search for the phrase “Itl”. Here is an excerpt from the file:

Itl           Xid                  Uba          Flag  Lck        Scn/Fsc
0x01   0x000a.019.00007c05  0x00c00288.1607.0e  ----    1  fsc 0x0000.00000000
0x02   0x0003.017.00009e24  0x00c00862.190a.0f  C---    0  scn 0x0000.02234e2b

This is where the information on row locking is stored. Remember, the row locking information is known as Interested Transaction List (ITL) and each ITL is stored in a “slot”. Here it shows two slots, which is the default number. Look for the one where the “Lck” column shows a value. It shows “1”, meaning one of the rows in the blocks is locked by a transaction. But, which transaction? To get that answer, note the value under the “Xid” column. It shows the transaction ID – 0x000a.019.00007c05. These numbers are in hexadecimal (as indicated by the 0x at the beginning of the number). Using the scientific calculator in Windows, I converted the values to decimal as 10, 25 and 31749 respectively. Do they sound familiar? Of course they do; they are exactly as reported by both the record in v$transaction and the dbms_transaction.local_transaction_id function call.

This is how Oracle determines that there is a transaction has locked the row and correlates it to the various components in the other areas – mostly the undo segments to determne if it is active. Now that you know undo segments holds the transaction details, you may want to know more about the segment. Remember, the undo segment is just a segment, like any other table, indexes, etc. It resides in a tablespace, which is on some datafile. To find out the specifics of the segment, we will look into some more columns of the view V$TRANSACTION:
SQL> select addr, xidusn, xidslot, xidsqn, ubafil, ubablk, ubasqn, ubarec,
  2  status, start_time, start_scnb, start_scnw, ses_addr
  3  from v$transaction;

ADDR         XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN
-------- ---------- ---------- ---------- ---------- ---------- ----------
    UBAREC STATUS           START_TIME           START_SCNB START_SCNW SES_ADDR
---------- ---------------- -------------------- ---------- ---------- --------
3F063C48         10         25      31749          3        648       5639
        14 ACTIVE           03/25/14 20:00:25      35868240          0 40A73784


1 row selected.

The columns with names starting with UBA show the undo block address information. Look at the above output. The UBAFIL shows the file#, which is “3” in this case. Checking for the file_id:

SQL> select * from dba_data_files
  2> where file_id = 3;

FILE_NAME
-------------------------------------------------------------------------
   FILE_ID TABLESPACE_NAME                     BYTES     BLOCKS STATUS
---------- ------------------------------ ---------- ---------- ---------
RELATIVE_FNO AUT   MAXBYTES  MAXBLOCKS INCREMENT_BY USER_BYTES USER_BLOCKS
------------ --- ---------- ---------- ------------ ---------- -----------
ONLINE_
-------
+DATA/d112d2/datafile/undotbs1.260.722742813
         3 UNDOTBS1                       4037017600     492800 AVAILABLE
           3 YES 3.4360E+10    4194302          640 4035969024      492672
ONLINE


1 row selected.

Note the UBASQN (which is the undo block sequence#) value in the earlier output, which was 5639. Let’s revisit the ITL entries in the dump of block:

Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.019.00007c05  0x00c00288.1607.0e  ----    1  fsc 0x0000.00000000
0x02   0x0003.017.00009e24  0x00c00862.190a.0f  C---    0  scn 0x0000.02234e2b

Look at the entry under the Uba column: 0x00c00288.1607.0e. As indicated by the “0x” at the beginning, these are in hexadecimal. Using a scientific calculator, let’s convert them. 1607 in hex means 5639 in decimal – the UBA Sequence# (UBASQN). The value “e” is 14 in decimal, which corresponds to the UBAREC. Finally the value 288 is 648 in decimal, which is the UBABLK. Now you see how the information is recorded in the block header and is also available to the DBA through the view V$TRANSACTION.

Let’s see some more important columns of the view. A typical database will have many sessions; not just one. Each session may have an active transaction, which means you have to link sessions to transactions to generate meaningful information. The transaction information also contains the session link. Note the column SES_ADDR, which is the address of the session that issued the transaction. From that, you can get the session information

SQL> select sid, username
  2  from v$session
  3  where saddr = '40A73784';

SID USERNAME
--- --------
123 ARUP

There you go – you now have the SID of the session. And now that you know the SID, you can look up any other relevant data on the session from the view V$SESSION.

Takeaways

Here is a summary of what you learned so far:

  1. Transaction in Oracle starts with a data update (or intention to update) statement. Actually there are some exceptions which we will cover in a later article.
  2. It ends when a commit or rollback is issued
  3. A transaction is identified by a transaction ID (XID) which is a set of three numbers – undo segment#, undo slot# and undo record# – separated by periods.
  4. You can view the transaction ID in the session itself by calling dbms_transaction.local_transaction_id function.
  5. You can also check all the active transactions in the view v$transaction, where the columns XIDUSN, XIDSLOT and XIDSQN denote the undo segment#, undo slot# and undo rec# – the values that make up the transaction ID.
  6. The transaction information is also stored in the block header. You can check it by dumping the block and looking for the term “Itl”.
  7. The v$transaction view also contains the session address under SES_ADDR column, which can be used to join with the SADDR column of v$session view to get the session details.
  8. From the session details, you can find out other actions by the session such as the username, the SQL issues, the machine issued from, etc.
Advertisements

How to capture business service inputs for use in Business Service Simulator

Often errors reported in Siebel CRM Application, such unexpected processing results, application error or process crash can be initiated calling a certain custom or standard Business Service (BS).

Analysis of Siebel log / FDR / SARM traces may show that failure of Siebel Object Manager was caused by calling a method of a Business Service.

In this document we want to illustrate how the input arguments of a Business Service that was invoked during execution of the case could be captured in a file,

that can be picked up by the Business Service simulator.

Note: Not all Business Services of Siebel CRM Application could be invoked directly by BS-Simulator.

Usually good candidates for simulation are custom Business Services (server-side scripting), standard EAI Toolset Business Services (EAI Siebel Adapter/EAI UI Data Apdater,

an ASI/UDS, an Outbound WS Proxy, various XML Converters, the EAI Dispatcher,  the EAI Data Transformation Engine and so on) and the “Workflow Process Manager” BS.

Simulation of the “RunProcess” method of the  “Workflow Process Manager” Business Service allows to execute a workflow process process

within the BS-Simulator session, while same method invoked for the “Workflow Process Manager (Server Request)” BS

makes the process execution happen in separate task of the “Workflow Process Manager” server component.

 

Paths (can be combined) to capture the BS method input arguments in a file, that can be loaded in BS-Simulator.

(A) The .dmp File – input property set stored as XML document

Setting to “True” of the advanced component parameter: “Enable Business Service Argument Tracing” (short alias: “EnableServiceArgTracing“)
allows Siebel Object Manager component task to produce *.dmp files with input arguments for a certain standard business services.

Example of the  command of Server Manager Command line  utility (“srvrmgr“) that amends this parameter for a given component (example: “EAI Object Manager (ENU)“):
change param EnableServiceArgTracing=True for comp EAIObjMgr_enu

Once the parameter is set to “True” one also need to raise the logging level to 5 for tracing of a desired component event type,
to component task to trace arguments of a given Business Service into a *.dmp file.

Following two event types could be found useful:

(1) “EAI Dispatcher Argument Tracing” (short alias: “EAIDispatchSvcArgTrc“)

to let Inbound Web Service Framework component produce  “EAITransportDispatchService_input_args_*.dmp” file with input arguments

(original SOAP Message and HTTP Transport headers) received from Web Service Client (via Siebel Web Server Extension).

The .dmp file can be used as input of the “GenericDispatch” method of the “EAI Dispatch Service” Business Service to mimic the Inbound Web Service call dispatch.

Example of the  command of Server Manager Command line  utility (“srvrmgr“) that amends logging level of this event type to 5 for a given component

(example: “EAI Object Manager (ENU)“):

change evtloglvl EAIDispatchSvcArgTrc=5  for comp EAIObjMgr_enu

NOTE: To use this file as input in  the BS-Simulation one need to rename its extension to “.xml“.

NOTE-2: After(!), loading the file in BS-Simulator View one need to insert/correct the xml-prologue to <?xml version=”1.0″ encoding=”UTF-16″?>

to indicate the unicode text encoding, used by  Siebel UI

(2)  “EAI Siebel Adapter Argument Tracing” (short alias: “EAISiebAdptSvcArgTrc“)

to let methods of the “EAI Siebel Adapter” standard Business Service to produce produce “EAISiebelAdapter_input_args_*.dmp” files   with input arguments of the invoked method.

The .dmp file can be used as input of accordant menthod of the “EAI Siebel Adapter” Business Service.

Example of the  command of Server Manager Command line  utility (“srvrmgr“) that amends  logging level of this event type to 5 for a given component

(example: “EAI Object Manager (ENU)“):

change evtloglvl EAISiebAdptSvcArgTrc=5  for comp EAIObjMgr_enu

NOTE: To use this file as input in  the BS-Simulation one need to rename its extension to “.xml

 

(B) The .log File - Siebel Object Manager traces of various event types
Input arguments can be also captured right from the Siebel Application log, whenever one sees strings (often long) that starts 
 with the prefix: @0*  and  contains argumenz names and values separated by <NN>* ,  where <NN> is legth of a text value AFTER the * .

This string represents an internal serialization of a property set hierarchy, used by Siebel Object Manager to trace the property set content as string.

The serialization and de-serialization of a property set into / from a string is performed by standard Business Service "Workflow Utilities" 
 and its methods: "PropSetToText" and "TextToPropSet".
Traces of the the "Workflow Process Execution" (short alias: "PrcExec") and "Workflow Step Execution" (short: alias: "StpExec") component events
 shows input and output arguments (property set) for a workflow process start and execution of each its steps.

Example of the  command of Server Manager Command line  utility ("srvrmgr") that amends this parameter for a given component 
(example: "EAI Object Manager (ENU)"):
    change evtloglvl PrcExec=5  for comp EAIObjMgr_enu
 change evtloglvl StpExec=5  for comp EAIObjMgr_enu  

NOTE: Oversized property set can not be completely held in  the trace, in this cases, the approach will not work, 
 Incomplete trace can be recognized if last text elemnt the string does not follow the <NN>*<C(NN)> rule
 whre <NN> is number of characters: <C..C> after the *.

Below is  example of the Object Manager log file that  withinput argument traces for a Workflow Process  and a Business Service step:
ObjMgrBusServiceLog … InvokeMethod 4 … Begin: Business Service ‘Workflow Process Manager’ invoke method: ‘RunProcess’ …

EngInv EngInv 3 … Workflow engine requested to perform method ‘RunProcess’.

EngInv Arg 4Input: @0*0*1*1*0*3*0*11*ProcessName15*TST_T1_EAIQuery4*1*13*SiebelMessage0*9*MessageId9*8SIA-IDGZ11*MessageType18*Integration Object13*IntObjectName26*Internal Account Interface15*IntObjectFormat19*Siebel Hierarchical0*1*32*ListOfInternal Account Interface0*37*6*7*Account0*16*PO Approved Flag0*30*PO Auto Approval Currency Code3*USD17*Skip Credit Check1*N23*Assignment Country Code0*11*DUNS Number0*19*Parent Account Name0*25*Price List Integration Id0*10*Account Id10*8SIA-7EIGB14*Account Status6*Active20*Assignment Area Code0*17*Main Phone Number10*415897654318*Credit Status Date0*13*Currency Code3*USD20*Primary Organization20*Default Organization23*Parent Account Location0*15*Competitor Flag1*N14*Parent HQ DUNS0*5*Alias0*14*Integration Id0*29*Parent Account Integration Id0*12*Partner Flag1*N26*Credit Auto Approval Limit0*22*PO Auto Approval Limit0*4*Type8*Customer22*Domestic Ultimate DUNS0*19*Number of Employees0*9*Home Page0*21*PO Auto Approval Date0*9*Expertise0*23*VAT registration number0*15*Main Fax Number0*8*Location12*HQ-Corporate20*Global Ultimate DUNS0*4*Name15*A.K Parker Inc.17*Parent Account Id0*10*Price List0*13*Price List Id0*0*1*22*ListOfBusiness Address0*11*0*16*Business Address0*12*IsPrimaryMVG1*Y16*Street Address 20*6*County0*4*City8*Danville14*Street Address17*100 Danville Blvd5*State2*CA7*Country3*USA10*Address Id8*1-1FIG+C22*Address Integration Id0*8*Province0*11*Postal Code5*945860*1*23*ListOfRelated Sales Rep0*6*0*17*Related Sales Rep0*8*Position42*FIN 00 30 – Financial Advisor Credit Cards8*Division20*Default Organization12*IsPrimaryMVG1*Y5*Login7*PCONNOR11*Position Id10*8SIA-70NAN23*Position Integration Id0*0*0*21*ListOfRelated Contact0*0*1*26*ListOfRelated Organization0*4*0*20*Related Organization0*12*IsPrimaryMVG1*Y15*Organization Id6*0-R9NH27*Organization Integration Id0*12*Organization20*Default Organization0*0*20*ListOfCredit Profile0*0*0*22*ListOfRelated Industry0*

PrcExec PrcExec 3 … Executing process definition ‘MyProcess’.

PrcExec Create 4 … Instantiating process definition ‘MyProcess‘.

PrcExec PropSet 4    …    @0*0*4*1*13*SiebelMessage0*9*MessageId…22*ListOfRelated Industry0*

StpExec Create  4    …    Instantiating step definition ‘Start‘.

StpExec Create 4 … Instantiating step definition ‘MyBSstep‘.

StpExec Task 4 … Invoking method ‘MyMethhod‘ on business service ‘MyBS‘.

StpExec TaskArg 4 … Input: @0*

 

To turn the string with serialized representation of a property set into .xml  file, so that can be used as input arguments with BS-Simulator,

following custom business service: “AUX_T2PS” (attached as the “AUX_T2PS.xml” file to this document) can be used,

The BS to combines calls of following standard BS::Methods

  • EAI File Transport::Receive” – to read the serialized property set string from a file (a UTF-8 text copied from a .log file and begins with @0*)
  • Workflow Utilities::TextToPropSet” – to de-serialize the string from a file into a property set hierarchy
  • EAI XML Write to File::WritePropSet” – to save property set into an .xml file, that can be loaded in BS-Simulator (input arguments applet).

The Business Service has single method (“Execute“) and following input arguments:

  • FileName” – to specify location / name of the input text wile with a serialized property set string that begins with @0*  (will be read)
  • FileName2″ – to specify location / name of the  output .xml file with the decoded property set (will be produced)
  • other input parameters are optional and if present, will be passed to the “Receive” method of the “EAI File Transport” BS

This Business Service can be defined as run-time BS (defined in Siebel Client and requires no .SRF compilation)

The Business Service code (for copy+paste into BS server code edit):

Service_PreInvokeMethod:
function Service_PreInvokeMethod (MethodName, Inputs, Outputs)
{
try
{
//
// Read text file with the property set (serilalized as string)
//
var psX = Inputs.Copy();
var                   sCP = “CharSetConversion”;
if(  psX.GetProperty( sCP ) == “” )
psX.SetProperty( sCP,”UTF-8″ );
TheApplication().GetService(“EAI File Transport”).InvokeMethod(“Receive”, psX, Outputs );

//
// De-serialize property set from string into a property set object
//
psX = Outputs.Copy();
TheApplication().GetService(“Workflow Utilities”).InvokeMethod(“TextToPropSet”, psX, Outputs );

//
// Write xml file with the property set (serialized as XML document)
//
psX = Outputs.Copy();
psX.SetType(“PropertySet”);
psX.SetProperty(“FileName”, Inputs.GetProperty( “FileName2”) );
TheApplication().GetService(“EAI XML Write to File”).InvokeMethod(“WritePropSet”, psX, Outputs );
}
finally
{
//
// Done.
//
psX =  null;
};
return (CancelOperation);
}

WARNING: The provided code example is given for illustration purpose only.

(C) the Service_PreInvokeMethod event handler (server side code)
Finally one can temporary put a server side code into the "Service_PreInvokeMethod" event handler of a Business Service, to capture input arguments of the invoked method.

Following provides an example of a Siebel eScript code, one could put in the business service in order to capture input arguments in an *.xml file. In this example, the file will be produced 
in the "../log" folder of the current Siebel Application directory (which is usually: "bin" folder of the software installation root). 
The name of the file will contain the Business Service name (with stripped blank characters), the method name, the sequential number - incremented at each call in
 this session and the random number, to provide unique (across all sessions) part to the file name.
(declarations):
var nCnt = 0;                                                        //dump files counter
var dmp_filePfx = “../log/” + this.Name().replace(/ /g,””) +  “_”;   //dump file prefix (folder + BS name)


Service_PreInvokeMethod:
function Service_PreInvokeMethod (MethodName, Inputs, Outputs)
{
try  //dump input arguments of any method
{
var psIn = Inputs.Copy();
psIn.SetProperty(“FileName“, dmp_filePfx + MethodName + “_” + ToString(++nCnt) + “_$$_input.xml” ); //provide the dump file name
psIn.SetType(“PropertySet“);
with(TheApplication() ) GetService(“EAI XML Write to File“).InvokeMethod(“WritePropSet”, psIn, NewPropertySet() ); //create the dump file
}
finally {   psIn = null; }; //cleanup
return(ContinueOperation);
}

NOTE: This code overwrites value of the “FileName” input argument if any would be provided the method invocation. To keep the original value of this input argument, one need to add extra code to obtain its value

and store into alternative parameter (such as :”Original_FileName”) and then, in the BS-Simulator View, correct the input argument name back to: “FileName”.

NOTE-2: An invocation of standard business service method may not necessary invoke the server side code of the event handler. Some specialized and internal business services may not presume use

of the scripting engine of Siebel Application. However in may “regular” cases the scripting engine is used to the custom code that will dump the input arguments, so that they could be used in the BS-Simulator.

WARNING: The provided code example is given for illustration purpose only.

Before this kind of solution could be placed in a production environment, it has to be verified/tested against potential data damage and loss.

 

Auxiliary Workflow Process to invoke a Business Service with complex arguments from Business Service Simulator View

In some cases, such as Siebel UI size constrains for input or output argument fields), need to use non UTF-16 encoding,
either input arguments cannot be loaded from an .xml file  or7and output arguments could not be displayed in the BS-Simulator View.

In such situation it is practical to define a 3 steps auxiliary workflow process or  a custom business service to that actual invokes being tested Business Service
so that initiation of simulation in the BS-Simulator View becomes just a start of this this auxiliary process (using the “RunProcess” method of the “Workflow Process Manager” BS) or business service.

The auxiliary process / business should:
– read the input arguments from an .xml file
– invoked the being tested method of the Business Service
– write the output arguments in an .xml file

The Business Service has single method (“Execute“) and following input arguments:

  • FileName” – to specify location / name of the input .xml with  with content of input argunents (will be read)
  • FileName2″ – to specify location / name of the  output .xml file with content of output argunents (will be produced)
  • BS” – to specify name of being tested Business Service
  • Method” – to specify name of the Business Service method
  • other input parameters are optional and if present, will be passed to the “Receive” method of the “EAI File Transport” BS

This Business Service can be defined as run-time BS (defined in Siebel Client and requires no .SRF compilation)

The Business Service code (for copy+paste into BS server code edit):

Service_PreInvokeMethod:
function Service_PreInvokeMethod (MethodName, Inputs, Outputs)
{
try
{
with( TheApplication() )
{
//
//  Load Input arguments
//
var psIn   = NewPropertySet();
GetService(“EAI XML Read from File“).InvokeMethod(“ReadPropSet”, Inputs, psIn ); //”FileName” in Inputs will be used to load ps from file
//
//  Invoke the Business Service
//
var sBS      = Inputs.GetProperty(“BS“);
var sMethod = Inputs.GetProperty(“Method“);
var psOut    = NewPropertySet();
GetService( sBS ).InvokeMethod( sMethod , psIn, psOut );
//
//  Save Output arguments
//
psOut.SetProperty(“FileName“, Inputs.GetProperty(“FileName2“) );
GetService(“EAI XML Write to File“).InvokeMethod(“WritePropSet”, psOut, Outputs ); //Outputs: will show result of saving ps in the file
}; // end-with: TheApplication
}
finally {   psIn = null; psOut = null; }; //cleanup
return(ContinueOperation);
}

Catch the Latch on Oracle DB

Waits on the cache buffer chains latch, ie the wait event “latch: cache buffers chains” happen when there is extremely high and concurrent access to the same block in a database. Access to a block is normally a fast operation but if concurrent users access a block fast enough, repeatedly then simple access to the block can become an bottleneck. The most common occurance of cbc (cache buffer chains) latch contention happens when multiple users are running nest loop joins on a table and accessing the table driven into via an index. Since the NL  join is basically a
  For all rows in i
     look up a value in j  where j.field1 = i.val
  end loop
then table j’s index on field1 will get hit for every row returned from i. Now if the lookup on i returns a lot of rows and if multiple users are running this same query then the index root block is going to get hammered on the index j(field1).
In order to solve a CBC latch bottleneck we need to know what SQL is causing the bottleneck and what table or index that the SQL statement is using is causing the bottleneck.
From ASH data this is fairly easy:
select 

      count(*), 
      sql_id, 
      nvl(o.object_name,ash.current_obj#) objn,
      substr(o.object_type,0,10) otype,
      CURRENT_FILE# fn,
      CURRENT_BLOCK# blockn
from  v$active_session_history ash
    , all_objects o
where event like 'latch: cache buffers chains'
  and o.object_id (+)= ash.CURRENT_OBJ#
group by sql_id, current_obj#, current_file#,
               current_block#, o.object_name,o.object_type
order by count(*)
/               
From the out put it looks like we have both the SQL (at least the id, we can get the text with the id) and the block:
CNT SQL_ID        OBJN     OTYPE   FN BLOCKN

---- ------------- -------- ------ --- ------
  84 a09r4dwjpv01q MYDUAL   TABLE    1  93170
But the block actually is probably left over from a recent IO and not actually the CBC hot block though it might be.
We can investigate further to get more information by looking at P1, P2 and P3 for the CBC latch wait. How can we find out what P1, P2 and P3 mean? by looking them up in V$EVENT_NAME:
select * from v$event_name

where name = 'latch: cache buffers chains'
EVENT#     NAME                         PARAMETER1 PARAMETER2 PARAMETER3 
---------- ---------------------------- ---------- ---------- ----------
        58 latch: cache buffers chains     address     number      tries 

So  P1 is the address of the latch for the cbc latch wait.

Now we can group the CBC latch waits by the address and find out what address had the most waits:
select

    count(*),
    lpad(replace(to_char(p1,'XXXXXXXXX'),' ','0'),16,0) laddr
from v$active_session_history
where event='latch: cache buffers chains'
group by p1
order by count(*);   
COUNT(*)  LADDR

---------- ----------------
      4933 00000004D8108330   
In this case, there is only one address that we had waits for, so now we can look up what blocks (headers actually) were at that address
select o.name, bh.dbarfil, bh.dbablk, bh.tch

from x$bh bh, obj$ o
where tch > 5
  and hladdr='00000004D8108330'
  and o.obj#=bh.obj
order by tch
NAME        DBARFIL DBABLK  TCH

----------- ------- ------ ----
EMP_CLUSTER       4    394  120        
We look for the block with the highest “TCH” or “touch count”. Touch count is a count of the times the block has been accesses. The count has some restrictions. The count is only incremented once every 3 seconds, so even if I access the block 1 million times a second, the count will only go up once every 3 seconds. Also, and unfortunately, the count gets zeroed out if the block cycles through the buffer cache, but probably the most unfortunate is that  this analysis only works when the problem is currently happening. Once the problem is over then the blocks will usually get pushed out of the buffer cache.
In the case where the CBC latch contention is happening right now we can run all of this analysis in one query
select 

        name, file#, dbablk, obj, tch, hladdr 
from x$bh bh
    , obj$ o
 where 
       o.obj#(+)=bh.obj and
      hladdr in 
(
    select ltrim(to_char(p1,'XXXXXXXXXX') )
    from v$active_session_history 
    where event like 'latch: cache buffers chains'
    group by p1 
    having count(*) > 5
)
   and tch > 5
order by tch   
example output
NAME          FILE# DBABLK    OBJ TCH HLADDR

------------- ----- ------ ------ --- --------
BBW_INDEX         1 110997  66051  17 6BD91180
IDL_UB1$          1  54837     73  18 6BDB8A80
VIEW$             1   6885     63  20 6BD91180
VIEW$             1   6886     63  24 6BDB8A80
DUAL              1   2082    258  32 6BDB8A80
DUAL              1   2081    258  32 6BD91180
MGMT_EMD_PING     3  26479  50312 272 6BDB8A80
This can be misleading, as TCH gets set to 0 every rap around the LRU and it only gets updated once every 3 seconds, so in this case DUAL was my problem table not MGMT_EMD_PING
Deeper Analysis from Tanel Poder
Using Tanel’s ideas here’s a script to get the objects that we have the most cbc latch waits on
col object_name for a35
col cnt for 99999
SELECT
  cnt, object_name, object_type,file#, dbablk, obj, tch, hladdr
FROM (
  select count(*) cnt, rfile, block from (
    SELECT /*+ ORDERED USE_NL(l.x$ksuprlat) */
      –l.laddr, u.laddr, u.laddrx, u.laddrr,
      dbms_utility.data_block_address_file(to_number(object,’XXXXXXXX’)) rfile,
      dbms_utility.data_block_address_block(to_number(object,’XXXXXXXX’)) block
    FROM
       (SELECT /*+ NO_MERGE */ 1 FROM DUAL CONNECT BY LEVEL <= 100000) s,
       (SELECT ksuprlnm LNAME, ksuprsid sid, ksuprlat laddr,
       TO_CHAR(ksulawhy,’XXXXXXXXXXXXXXXX’) object
        FROM x$ksuprlat) l,
       (select  indx, kslednam from x$ksled ) e,
       (SELECT
                    indx
                  , ksusesqh     sqlhash
  , ksuseopc
  , ksusep1r laddr
             FROM x$ksuse) u
    WHERE LOWER(l.Lname) LIKE LOWER(‘%cache buffers chains%’)
     AND  u.laddr=l.laddr
     AND  u.ksuseopc=e.indx
     AND  e.kslednam like ‘%cache buffers chains%’
    )
   group by rfile, block
   ) objs,
     x$bh bh,
     dba_objects o
WHERE
      bh.file#=objs.rfile
 and  bh.dbablk=objs.block
 and  o.object_id=bh.obj
order by cnt
;
CNT  OBJECT_NAME       TYPE  FILE#  DBABLK    OBJ   TCH  HLADDR

---- ----------------- ----- ----- ------- ------ ----- --------
   1 WB_RETROPAY_EARNS TABLE     4   18427  52701  1129 335F7C00
   1 WB_RETROPAY_EARNS TABLE     4   18194  52701  1130 335F7C00
   3 PS_RETROPAY_RQST  TABLE     4   13253  52689  1143 33656D00
   3 PS_RETROPAY_RQST  INDEX     4   13486  52692   997 33656D00
   3 WB_JOB            TABLE     4   14443  52698   338 335B9080
   5 PS_RETROPAY_RQST  TABLE     4   13020  52689   997 33656D00
   5 WB_JOB            TABLE     4   14676  52698   338 335B9080
   5 WB_JOB            TABLE     4   13856  52698   338 335F7C00
   6 WB_JOB            TABLE     4   13623  52698   338 335F7C00
   7 WB_JOB            TABLE     4   14909  52698   338 335B9080
 141 WB_JOB            TABLE     4   15142  52698   338 335B9080
2513 WB_JOB            INDEX     4   13719  52699   997 33656D00
Why do we get cache buffers chains latch contention?
In order to understand why we get CBC latch contention we have to understand what the CBC latch protects. The CBC latch protects information controlling the buffer cache. Here is a schematic of computer memory and the Oracle processes, SGA and the main components of the SGA:
The buffer cache holds in memory versions of datablocks for faster access. Can you imagine though how we find a block we want in the buffer cache? The buffer cache doesn’t have a index of blocks it contains and we certainly don’t scan the whole cache looking for the block we want (though I have heard that as a concern when people increase the size of there buffer cache). The way we find a block in the buffer cache is by taking the block’s address, ie it’s file and block number and hashing it. What’s hashing? A simple example of hashing is  the “Modulo” function
1 mod 4 = 1

2 mod 4 = 2
3 mod 4 = 3
4 mod 4 = 0
5 mod 4 = 1
6 mod 4 = 2
7 mod 4 = 3
8 mod 4 = 0
Using “mod 4” as a hash funtion creates 4 possible results. These results are used by Oracle as “buckets” or identifiers of locations to store things. The things in this case will be block headers.
Block headers are meta data about data block including pointers to the actual datablock as well as pointers to the other headers in the same bucket.
The block headers in the hash buckets are connected via a doubly linked list. One link points forward the other points backwards
The resulting layout looks like
the steps to find a block in the cache are
If there are a lot of sessions concurrently accessing the same buffer header (or buffer headers in the same bucket) then the latch that protects that bucket will get hot and users will have to wait getting “latch: cache buffers chains” wait.
Two ways this can happen (among probably several others)
For the nested loops example, Oracle will in some (most?) cases try and pin the root block of the index because Oracle knows we will be using it over and over. When a block is pinned we don’t have to use the cbc latch. There seem to be cases (some I think might be bugs) where the root block doesn’t get pinned. (I want to look into this more – let me know if you have more info)
One thing that can make CBC latch contention worse is if a session is modifying the data block that users are reading because readers will clone a block with uncommitted changes and roll back the changes in the cloned block:
all these clone copies will go in the same bucket and be protected by the same latch:

How many copies of a block are in the cache?

select 

       count(*)
     , name
     , file#
     , dbablk
     , hladdr 
from   x$bh bh
          , obj$ o
where 
      o.obj#(+)=bh.obj and
      hladdr in 
(
    select ltrim(to_char(p1,'XXXXXXXXXX') )
    from v$active_session_history 
    where event like 'latch: cache%'
    group by p1 
)
group by name,file#, dbablk, hladdr
having count(*) > 1
order by count(*);
CNT NAME        FILE#  DBABLK HLADDR

--- ---------- ------ ------- --------
 14 MYDUAL          1   93170 2C9F4B20
Notice that the number of copies, 14, is higher the the max number of copies allowed set by “_db_block_max_cr_dba = 6” in 10g. The reason is this value is just a directive not a restriction. Oracle tries to limit the  number of copies.
Solutions

Find SQL ( Why is application hitting the block so hard? )

Possibly change application logic

Eliminate hot spots

Nested loops, possibly

Hash Partition the index with hot block

Use Hash Join instead of Nested loop join
Use Hash clusters

Look up tables (“select language from lang_table where …”)

Change application
Use plsql function
Spread data out to reduce contention, like set PCTFREE to 0 and recreate the table so that there is only one row per block

Select from dual

Possibly use x$dual
Note starting in 10g Oracle uses the “fast dual” table (ie x$dual) automatically when executing a query on dual as long as the column “dummy” is not accessed. Accessing dummy would be cases like
    select count(*) from dual;
    select * from dual;
    select dummy from dual;
an example of not accessing “dummy” would be:
    select 1 from dual;
    select sysdate from dual;

Updates, inserts , select for update on blocks while reading those blocks