Samstag, 24. Juni 2017

Method R Profiler 7.0 available

I'm seldom doing product announcements on my blog. So there must be a reason why I do not follow this rule.

Method-R recently released a new version of it's powerful SQL-Trace profiler.
Of course there are many improvements on the profiler engine itself, but as the prodict was very stable already they will only help in rare edge cases, most people would not benefit dramatically.

But there is a new featuere which brings a lot of added value to the tool and improves time to repair/fix/enlightenment for many readers of the trace file:
They added an explanatory text to many sections.
Here an example (from their sample file - slightly edited to better fit the size of this blog):

Your experience duration was dominated by “cell single block physical read” and “CPU: FETCH dbcalls” calls. These are all high-productivity calls; their dominance means that the Oracle Database is working hard for your application. Now you should determine whether all of that work is actually necessary. Consult the Profile by Cursor and Profile by Statement sections for the next step of your diagnosis.
SubroutineDurationCallsDuration per call (seconds)
secondsRmeanminskewmax
1cell single block physical read4.64469.9%6,8990.00070.0005▁▁▁█▁▁▁▁▁▁▁0.028
2CPU: FETCH dbcalls2.38135.8%7,9470.00030.0000▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁█▁▁2.071

Unfortunately I did not manage to copy all the nice formatting into this blog, so i created a screenshot to show the original picture.

As you can see there is a new text with more in detail explanation about the Subroutine and where to go next for deeper analysis. I see this as a big advantage for everyone who does not work with SQL-Trace files every day. (well, even for those it's helpful).

There is a very nice demo version available, so if you have a trace file to analyze and want to check, if you see added value by this suggestion, give it a try.

Beside the new Profiler, Method-R has redone their web page as well (and I expect more news to follow). It's worth to invest some minutes and have a glimpse.

Dienstag, 11. April 2017

search for the missing ORA-04068

Oracle Active Dataguard is a great software for offloading read only activities from the primary nodes to nodes which is not utilized anyhow (at least during non-disaster times).
A very good example for doing so is regular extraction of data for processing in a data warehouse. That's also the case which led to this post.

The setting is just slightly more complex: both, primary and standby site are RACs - of course when you plan to switchover/failover, they should be somehow similar. (And in case of a disaster, it's planned to failover an disable all not-so-important activities; warehouse extract would be one of those).

Also the offloaded activities are slightly more complex than a simple query. In this case PL/SQL code is included. According to the documentation, that's not an issue at all - it's still read only. But in this DB (11.2.0.4), it was an issue: Sometimes the result was not as expected.

Now I first need to define the expectation: in this special case, the ERROR is the expectation, and no error the wrong result. Whenever a package (with global variables) is changed in a session, all other sessions which initialized the package before the change, but called it afterwards, must get

04068, 00000, "existing state of packages%s%s%s has been discarded"
// *Cause:  One of errors 4060 - 4067 when attempt to execute a stored
//          procedure.
// *Action: Try again after proper re-initialization of any application's
//          state.

Then the application can catch this error, re-initiate the package and continue.
that's how it should be.

But we sometimes had strange results in the test environment. After some investigation, we found it and I simplified it to this Testcase:

Testcase: 
(results are only shown when of any interest) 
INST1_ ... Instance1
INST2_ ... Instance2
PRI ... Primary DB 
ADG ... Active DataGuard 
it's important the sessions are not closed within the test!

prepare user:
SYS@INST1_PRI:
=============
-- create the user 
create user SR_TEST1 identified by "xxx" default tablespace users temporary tablespace temp; 
grant connect, create session, create procedure, create table to SR_TEST1; 
alter user SR_TEST1 quota unlimited on users; 

SESSION1@INST1_PRI:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 

create table tab1 (id number, ver varchar2(30)); 
insert into tab1 values (1, '_'); 
commit; 

create or replace PACKAGE PACK AS 

first_load date := to_date('3333-01-01 01:01:01', 'YYYY-MM-DD HH24:MI:SS'); 

PROCEDURE proc; 

END PACK; 
/ 

CREATE OR REPLACE 
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('1: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 
NULL; 
END proc; 

END PACK; 
/ 
update tab1 set ver='1' where id=1; 
commit; 

create or replace procedure proc1 
is 
begin 
SR_TEST1.pack.proc; 
end; 
/ 

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 1: loaded first on 2017-03-15 09:52:00 
> 
> PL/SQL procedure successfully completed. 
> 
select * from tab1; 
> 
> ID VER 
>---------- ------------------------------ 
> 1 1 

SESSION2@INST2_PRI:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 1: loaded first on 2017-03-15 09:53:00 
> 
> PL/SQL procedure successfully completed. 
> 
select * from tab1; 
> 
> ID VER 
>---------- ------------------------------ 
> 1 1 

SESSION3@INST1_ADG:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 
-- here the redo apply is running
select sid from v$session where program like '%MRP0%';

>       SID
>----------
>         2


-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

> 1: loaded first on 2017-03-15 09:53:21 
> 
> PL/SQL procedure successfully completed. 
> 
> select * from tab1; 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 1 

SESSION4@INST1_ADG:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 
-- NO redo apply is running hereselect inst_id, sid from gv$session where program like '%MRP0%';

> no rows selected

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

> 1: loaded first on 2017-03-15 09:54:00 
> 
> PL/SQL procedure successfully completed. 
> 
> select * from tab1; 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 1 

SESSION1@INST1_PRI:
==================
-- change package body 

CREATE OR REPLACE 
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('2: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 
NULL; 
END proc; 

END PACK; 
/ 
update tab1 set ver='2' where id=1; 
commit; 


-- test 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 2: loaded first on 2017-03-15 09:55:51 
> 
> PL/SQL procedure successfully completed. 
> 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 2 

SESSION2@INST2_PRI:
==================
-- test 
set serveroutput on 
exec SR_TEST1.proc1 
SR_TEST1@EBSSID051 > BEGIN SR_TEST1.proc1; END; 

* 
ERROR at line 1: 
ORA-04068: existing state of packages has been discarded 
ORA-04061: existing state of package body "SR_TEST1.PACK" has been invalidated 
ORA-04065: not executed, altered or dropped package body "SR_TEST1.PACK" 
ORA-06508: PL/SQL: could not find program unit being called: "SR_TEST1.PACK" 
ORA-06512: at "SR_TEST1.PROC1", line 4 
ORA-06512: at line 1 


> select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 

-- THIS Is the expected result 

SESSION3@INST1_ADG:
==================
-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

* 
ERROR at line 1: 
ORA-04068: existing state of packages has been discarded 
ORA-04061: existing state of package body "SR_TEST1.PACK" has been invalidated 
ORA-04065: not executed, altered or dropped package body "SR_TEST1.PACK" 
ORA-06508: PL/SQL: could not find program unit being called: "SR_TEST1.PACK" 
ORA-06512: at "SR_TEST1.PROC1", line 4 
ORA-06512: at line 1 

select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 

SESSION4@INST1_ADG:
==================
-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

1: loaded first on 2017-03-15 09:54:00 

PL/SQL procedure successfully completed. 

select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 
-- HERE you see the update on the table is applied on ADG already, but the procedure output is still prefixed with 1: - not 2: as it should be.

SESSION4@INST2_ADG:
==================
-- to check: the package is there! 
set pages 99 
select text from all_source where name = 'PACK' 
and type = 'PACKAGE BODY' 
3 order by line; 

TEXT 
------------------------
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('2: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 

NULL; 
END proc; 

END PACK; 

12 rows selected. 

Even the testcase is clear and simple, it was not that easy to identify the root cause in a more complex life environment. Special thanks to Andy Sayer who helped me to sort and refine my ideas over twitter.

With this testcase I was able to open a SR at Oracle, and after some some Support-ping-pong I got a useful information:
It is known as Bug: 18357555: ADD SUPPORT FOR KGLIV_BRNULPALL TO KGLCLUSTERMESSAGE() / KQLMBIVG()

I'm only aware of one Patch:25808607 (for PSU:11.2.0.4.160419 - together with some other patches) which is available right now. But you can ask for your own, if you have an Active Dataguard and RAC, and Packages with global variables.

Mittwoch, 5. April 2017

Can I flush the shared pool of all RAC Instances in an Active DataGuard?

That's a good question.

And the answer is simple: Yes, but NO!

Well, ok; that's not very helpful. But whenever is anything simple in modern IT?

So I'll show why Yes, it's possible:

first as SYS create a dummy user without special permissions and a function to drop the shared pool:

create user berx identified by xxx;
grant connect to berx;
grant select on v_$instance to berx;
    
create or replace function sys.flush_shared_pool_bx return varchar2
as
begin
        execute immediate 'alter system flush shared_pool';
  return 'Flush_Shared_Pool executed on ' || sysdate || ' by ' || sys_context('USERENV', 'SESSION_USER');

end;
/
grant execute on sys.flush_shared_pool_bx  to public;

SELECT * FROM TABLE(gv$(CURSOR(
  SELECT INSTANCE_NUMBER, sys.flush_shared_pool_bx flush_result
  FROM v$instance)));  

Unfortunately I can not use the gv$ function as user berx directly.

So a view does the trick:
create view v_flush_shared_pool_bx
as 
SELECT * FROM TABLE(gv$(CURSOR(
  SELECT INSTANCE_NUMBER, sys.flush_shared_pool_bx flush_result
  FROM v$instance))); 

grant select on v_flush_shared_pool_bx to public;

create or replace
function f_gv_flush
return DBMS_DEBUG_VC2COLL
PIPELINED 
IS
  res varchar2(120);
   cursor c1 is
     SELECT INSTANCE_NUMBER || ' - ' || FLUSH_RESULT as in_result
 FROM TABLE(gv$(CURSOR(
  SELECT INSTANCE_NUMBER, sys.flush_shared_pool_bx flush_result
  FROM v$instance)));
begin
  FOR res_row in c1
  LOOP
      pipe row (to_char(res_row.in_result)) ;
  END LOOP;
  return;
end;
/

and now it's so simple to flush on all instances:

connect berx/

select * from sys.v_flush_shared_pool_bx;
  
INSTANCE_NUMBER FLUSH_RESULT
--------------- --------------------------------------------------
              1 Flush_Shared_Pool executed on 10-APR-17 by BERX
              3 Flush_Shared_Pool executed on 10-APR-17 by BERX
              2 Flush_Shared_Pool executed on 10-APR-17 by BERX


or

select * from table(sys.f_gv_flush);

COLUMN_VALUE
-----------------------------------------------------
1 - Flush_Shared_Pool executed on 10-APR-17 by BERX
2 - Flush_Shared_Pool executed on 10-APR-17 by BERX
3 - Flush_Shared_Pool executed on 10-APR-17 by BERX

So obviously the answer if I an flush shared pool in all RAC instances - even in Active DataGuard is: Yes.

But at the same time the answer is NO, as I'm using the gv$ function - and this is not documented, so it's not wise to use it.

If it was not about an Active DataGuard, I'd use a scheduler job to run the flush shared_pool on all instances, but as the ADG is read only, I can't use it.

If you have another idea, how to do this (only with oracle means, no cronjob or similar), please let me know!

Mittwoch, 8. März 2017

force connections to use SERVICE_NAME

During the setup of a project I had a small observation: The Oracle Database is running on a 4 node RAC, but we (DBAs) wanted to provide only 2 instances to the application team.

So we created a dedicated service with Preferred instances: inst1,inst2 and Available instances: inst3.inst4

But after some minutes I saw sessions running on all 4 instances. Those sessions used the Service SYS$USERS - so they managed to "guess" the SID and created a connetionstring with SID, not SERVICE_NAME.

We contacted them and asked to use the connection string we provided (with SERVICE_NAME) which was no problem at all.

But the question remains - how to avoid connections with SID?

One idea was to create a after logon on database trigger and there check for special allowed users, hosts, or whatever. This seems fine, but there are situations, where SYS$USERS are used by default even from users, which should not use it for connections. One example are scheduler jobs. this can be solved by changing the SERVICE of DEFAULT_JOB_CLASS to the desired service:
DBMS_SCHEDULER.SET_ATTRIBUTE ('DEFAULT_JOB_CLASS','SERVICE','PREFERRED_SERVICE');

This requires additional actions and care - and a good documentation to avoid future issues.

As my initial question was to stop connections using SID through the listener, not interfere with generic bequeath connections. So I focused on the listener:

Even the SID is registered automatically when an instance registers to the listener, it can be specified manually as well.
So I tried to abuse this feature a little bit.

My config:
SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (ORACLE_HOME=/tmp/berx1)
      (SID_NAME=ORCLSID))
)

but the listener searches for an oracle binary in this fake ORACLE_HOME
TNS-01201: Listener cannot find executable /tmp/berx1/bin/oracle for SID EDWP123
so I created a binary there:
ln -s /bin/false  /tmp/berx1/bin/oracle
and now the listener started fine.

A connection attempt using the SID fails now with
ERROR:
ORA-12537: TNS:connection closed

and the listener.log shows the errors
08-MAR-2017 11:14:20 * (CONNECT_DATA=(SID=ORCLSID)(CID=(PROGRAM=sqlplus)(HOST=myhost)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=1.2.3.4)(PORT=50714)) * establish * ORCLSID * 12518
TNS-12518: TNS:listener could not hand off client connection
 TNS-12547: TNS:lost contact
  TNS-12560: TNS:protocol adapter error
   TNS-00517: Lost contact
    Linux Error: 32: Broken pipe

That's more or less a solution to my question.
Of course I do not recommend to implement it, at least not without proper testing.
Especially in a DataGuard setup a SID connection is required for some activities, so there  a dedicated listener for DataGuard activities is required in such a configuration.