2017-04-11

search for the missing ORA-04068

Oracle Active Dataguard is a great software for offloading read only activities from the primary nodes to nodes which is not utilized anyhow (at least during non-disaster times).
A very good example for doing so is regular extraction of data for processing in a data warehouse. That's also the case which led to this post.

The setting is just slightly more complex: both, primary and standby site are RACs - of course when you plan to switchover/failover, they should be somehow similar. (And in case of a disaster, it's planned to failover an disable all not-so-important activities; warehouse extract would be one of those).

Also the offloaded activities are slightly more complex than a simple query. In this case PL/SQL code is included. According to the documentation, that's not an issue at all - it's still read only. But in this DB (11.2.0.4), it was an issue: Sometimes the result was not as expected.

Now I first need to define the expectation: in this special case, the ERROR is the expectation, and no error the wrong result. Whenever a package (with global variables) is changed in a session, all other sessions which initialized the package before the change, but called it afterwards, must get

04068, 00000, "existing state of packages%s%s%s has been discarded"
// *Cause:  One of errors 4060 - 4067 when attempt to execute a stored
//          procedure.
// *Action: Try again after proper re-initialization of any application's
//          state.

Then the application can catch this error, re-initiate the package and continue.
that's how it should be.

But we sometimes had strange results in the test environment. After some investigation, we found it and I simplified it to this Testcase:

Testcase: 
(results are only shown when of any interest) 
INST1_ ... Instance1
INST2_ ... Instance2
PRI ... Primary DB 
ADG ... Active DataGuard 
it's important the sessions are not closed within the test!

prepare user:
SYS@INST1_PRI:
=============
-- create the user 
create user SR_TEST1 identified by "xxx" default tablespace users temporary tablespace temp; 
grant connect, create session, create procedure, create table to SR_TEST1; 
alter user SR_TEST1 quota unlimited on users; 

SESSION1@INST1_PRI:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 

create table tab1 (id number, ver varchar2(30)); 
insert into tab1 values (1, '_'); 
commit; 

create or replace PACKAGE PACK AS 

first_load date := to_date('3333-01-01 01:01:01', 'YYYY-MM-DD HH24:MI:SS'); 

PROCEDURE proc; 

END PACK; 
/ 

CREATE OR REPLACE 
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('1: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 
NULL; 
END proc; 

END PACK; 
/ 
update tab1 set ver='1' where id=1; 
commit; 

create or replace procedure proc1 
is 
begin 
SR_TEST1.pack.proc; 
end; 
/ 

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 1: loaded first on 2017-03-15 09:52:00 
> 
> PL/SQL procedure successfully completed. 
> 
select * from tab1; 
> 
> ID VER 
>---------- ------------------------------ 
> 1 1 

SESSION2@INST2_PRI:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 1: loaded first on 2017-03-15 09:53:00 
> 
> PL/SQL procedure successfully completed. 
> 
select * from tab1; 
> 
> ID VER 
>---------- ------------------------------ 
> 1 1 

SESSION3@INST1_ADG:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 
-- here the redo apply is running
select sid from v$session where program like '%MRP0%';

>       SID
>----------
>         2


-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

> 1: loaded first on 2017-03-15 09:53:21 
> 
> PL/SQL procedure successfully completed. 
> 
> select * from tab1; 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 1 

SESSION4@INST1_ADG:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 
-- NO redo apply is running hereselect inst_id, sid from gv$session where program like '%MRP0%';

> no rows selected

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

> 1: loaded first on 2017-03-15 09:54:00 
> 
> PL/SQL procedure successfully completed. 
> 
> select * from tab1; 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 1 

SESSION1@INST1_PRI:
==================
-- change package body 

CREATE OR REPLACE 
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('2: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 
NULL; 
END proc; 

END PACK; 
/ 
update tab1 set ver='2' where id=1; 
commit; 


-- test 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 2: loaded first on 2017-03-15 09:55:51 
> 
> PL/SQL procedure successfully completed. 
> 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 2 

SESSION2@INST2_PRI:
==================
-- test 
set serveroutput on 
exec SR_TEST1.proc1 
SR_TEST1@EBSSID051 > BEGIN SR_TEST1.proc1; END; 

* 
ERROR at line 1: 
ORA-04068: existing state of packages has been discarded 
ORA-04061: existing state of package body "SR_TEST1.PACK" has been invalidated 
ORA-04065: not executed, altered or dropped package body "SR_TEST1.PACK" 
ORA-06508: PL/SQL: could not find program unit being called: "SR_TEST1.PACK" 
ORA-06512: at "SR_TEST1.PROC1", line 4 
ORA-06512: at line 1 


> select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 

-- THIS Is the expected result 

SESSION3@INST1_ADG:
==================
-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

* 
ERROR at line 1: 
ORA-04068: existing state of packages has been discarded 
ORA-04061: existing state of package body "SR_TEST1.PACK" has been invalidated 
ORA-04065: not executed, altered or dropped package body "SR_TEST1.PACK" 
ORA-06508: PL/SQL: could not find program unit being called: "SR_TEST1.PACK" 
ORA-06512: at "SR_TEST1.PROC1", line 4 
ORA-06512: at line 1 

select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 

SESSION4@INST1_ADG:
==================
-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

1: loaded first on 2017-03-15 09:54:00 

PL/SQL procedure successfully completed. 

select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 
-- HERE you see the update on the table is applied on ADG already, but the procedure output is still prefixed with 1: - not 2: as it should be.

SESSION4@INST2_ADG:
==================
-- to check: the package is there! 
set pages 99 
select text from all_source where name = 'PACK' 
and type = 'PACKAGE BODY' 
3 order by line; 

TEXT 
------------------------
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('2: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 

NULL; 
END proc; 

END PACK; 

12 rows selected. 

Even the testcase is clear and simple, it was not that easy to identify the root cause in a more complex life environment. Special thanks to Andy Sayer who helped me to sort and refine my ideas over twitter.

With this testcase I was able to open a SR at Oracle, and after some some Support-ping-pong I got a useful information:
It is known as Bug: 18357555: ADD SUPPORT FOR KGLIV_BRNULPALL TO KGLCLUSTERMESSAGE() / KQLMBIVG()

I'm only aware of one Patch:25808607 (for PSU:11.2.0.4.160419 - together with some other patches) which is available right now. But you can ask for your own, if you have an Active Dataguard and RAC, and Packages with global variables.

2017-04-05

Can I flush the shared pool of all RAC Instances in an Active DataGuard?

That's a good question.

And the answer is simple: Yes, but NO!

Well, ok; that's not very helpful. But whenever is anything simple in modern IT?

So I'll show why Yes, it's possible:

first as SYS create a dummy user without special permissions and a function to drop the shared pool:

create user berx identified by xxx;
grant connect to berx;
grant select on v_$instance to berx;
    
create or replace function sys.flush_shared_pool_bx return varchar2
as
begin
        execute immediate 'alter system flush shared_pool';
  return 'Flush_Shared_Pool executed on ' || sysdate || ' by ' || sys_context('USERENV', 'SESSION_USER');

end;
/
grant execute on sys.flush_shared_pool_bx  to public;

SELECT * FROM TABLE(gv$(CURSOR(
  SELECT INSTANCE_NUMBER, sys.flush_shared_pool_bx flush_result
  FROM v$instance)));  

Unfortunately I can not use the gv$ function as user berx directly.

So a view does the trick:
create view v_flush_shared_pool_bx
as 
SELECT * FROM TABLE(gv$(CURSOR(
  SELECT INSTANCE_NUMBER, sys.flush_shared_pool_bx flush_result
  FROM v$instance))); 

grant select on v_flush_shared_pool_bx to public;

create or replace
function f_gv_flush
return DBMS_DEBUG_VC2COLL
PIPELINED 
IS
  res varchar2(120);
   cursor c1 is
     SELECT INSTANCE_NUMBER || ' - ' || FLUSH_RESULT as in_result
 FROM TABLE(gv$(CURSOR(
  SELECT INSTANCE_NUMBER, sys.flush_shared_pool_bx flush_result
  FROM v$instance)));
begin
  FOR res_row in c1
  LOOP
      pipe row (to_char(res_row.in_result)) ;
  END LOOP;
  return;
end;
/

and now it's so simple to flush on all instances:

connect berx/

select * from sys.v_flush_shared_pool_bx;
  
INSTANCE_NUMBER FLUSH_RESULT
--------------- --------------------------------------------------
              1 Flush_Shared_Pool executed on 10-APR-17 by BERX
              3 Flush_Shared_Pool executed on 10-APR-17 by BERX
              2 Flush_Shared_Pool executed on 10-APR-17 by BERX


or

select * from table(sys.f_gv_flush);

COLUMN_VALUE
-----------------------------------------------------
1 - Flush_Shared_Pool executed on 10-APR-17 by BERX
2 - Flush_Shared_Pool executed on 10-APR-17 by BERX
3 - Flush_Shared_Pool executed on 10-APR-17 by BERX

So obviously the answer if I an flush shared pool in all RAC instances - even in Active DataGuard is: Yes.

But at the same time the answer is NO, as I'm using the gv$ function - and this is not documented, so it's not wise to use it.

If it was not about an Active DataGuard, I'd use a scheduler job to run the flush shared_pool on all instances, but as the ADG is read only, I can't use it.

If you have another idea, how to do this (only with oracle means, no cronjob or similar), please let me know!