lunes, 25 de agosto de 2014

IBM Integration Bus (IIB) MQ retry and requeue tests

In this post I discuse a basic configuration that can be used to implement flow retry and requeue using MQ and IIB 9 (previously known as WebSphere Message Broker).

1. Input MQ Queue is persistent.


2. Input "MQ Input" node is transactional.



3. Input MQ Queue's backout threshold value is greater than 0 and has a backout requeue configured.


The Backout threshold property is also known as BOTHRESH. The Backout requeue queue property is also known as BOQNAME.

You can also set those values using runmqsc. For example:

runmqsc XXX_QMGR
ALTER QLOCAL(XXX_DEFAULT_QUEUE) BOTHRESH(10)
DISPLAY queue(XXX_DEFAULT_QUEUE)

The Backout threshold value represents the "number of retries" (in each "retry", the message is requeue in the same input Queue). Once the Backout threshold is reached, the message is enqueued in the Backout requeue queue. If no Backout requeue queue is configured, the message is enqueued in the queue manager's Dead-letter queue. If you configure neither, the first time a message reaches the Backout threshold, the queue will keep accepting new messages, but they won't be dequeued by your flow; you'll see this message in your event log (here, in spanish):
( XXX_BRK.XXX_EG ) No hay definida ninguna cola de reposición en cola para restitución ni cola de mensajes no entregados (DLQ) para restituir un mensaje; nodo Input (de entrada) 'In.MQ Input' recibido de la cola ''XXX_DEFAULT_QUEUE''.   
El nodo de entrada 'In.MQ Input' ha recibido un mensaje procedente de la cola ''XXX_DEFAULT_QUEUE'', pero se ha visto obligado a intentar una restitución debido a la detección de una condición de anomalía interna asociado al mismo. El intermediario de mensajes ha intentado ahora propagar el mensaje al 'nombre de cola de reposición en cola para restitución' o la cola de mensajes no entregados, pero ninguna de estas colas existe. El flujo de mensajes dejará de procesar el mensaje y continuará restituyéndose periódicamente.  Esta situación puede hacer que se acumulen en la cola de entrada hasta que el mensaje en cuestión se elimine o se procese correctamente.   
Examine los mensajes anteriores y posiblemente el flujo de mensajes para determinar el motivo por el cual el mensaje se está restituyendo.  Si se define la cola de reposición en cola para restitución ('backoutRequeueQueue') o la cola de mensajes no entregados, el mensaje se propagará a una de dichas colas.  
When yo do configure a  Backout requeue queue, the message is as following:
( XXX_BRK.XXX_EG ) El mensaje se ha restituido a una cola; nodo 'In.MQ Input'.
El nodo 'In.MQ Input' ha recibido un mensaje que se había restituido previamente una o más veces debido a un proceso de error en la vía de acceso principal del flujo de mensajes. El terminal de anomalías (failure terminal) no está conectado, por lo que el intermediario de mensajes está transfiriendo el mensaje directamente a la cola de reposición en cola o a la cola de restituciones de mensajes no entregados asociada a este nodo. La cuenta de restituciones ('backoutCount') del MQMD del mensaje es ahora igual al umbral de restituciones ('backoutThreshold') definido para la cola de entrada de WebSphere MQ.
Examine los mensajes anteriores y el flujo de mensajes para determinar el motivo por el cual el mensaje se restituye.  Corrija esta situación si es posible.  Lleve a cabo todos los procesos de recuperación locales que sean necesarios.

4. When the flow wants to requeue and retry the current message, an unhandled exception has to be thrown in the flow and it has to reach the MQ Input node without being handled.

For example:

Compute:

Each time the message is "retried", the value of MQMD.BackoutCount is increased. You can easily determine if a message is a retry, checking if the value of MQMD.BackoutCount is greater than 0.



5. Delay

There's no configuration to specify a "retry delay" (reference). It can for example be implemented  executing a pause after checking if the value of MQMD.BackoutCount is greater than 0.

References:


IBM Integracion Bus (IIB) COMMIT, ROLLBACK and UNCOORDINATED tests

There aren't many references to the COMMIT, ROLLBACK and UNCOORDINATED keywords in the IBM Integration Bus (IIB; prevously known as WebSphere Message Broker) 9 documentation.

The main reference I have found is in the "The transactional model" section of the official documentation; in the current document version, in the "Database auxiliary transactions" sub-section, it states: "Use the ESQL COMMIT and ROLLBACK statements to commit and roll back auxiliary database transactions. Obtain operations outside the main transaction by specifying the UNCOORDINATED keyword on the individual database statements (for example, the INSERT and UPDATE statements)"; in the "Queue auxiliary transactions"  sub-section, it notes: "The COMMIT and ROLLBACK statements therefore operate only on databases.".

Although the keywords are listed in that sections, I haven't found more specific references (including syntax trees); doing trial and error tests, it can be seen that otherwise as indicated in the documentation, the  UNCOORDINATED keyword can't be specified on the individual database statements:




In the "Dynamic Database Name and Multiple Database Support" IBM presentation (for WebSphere Message Broker v6), the following information can be found: "Any uncommitted database operations that have not been committed by nodes with their transaction property set to COMMIT are automatically committed by the input node if the flow ends normally, and rolled back if there is an unhandled exception".

In an (IBM internal?) presentation (source 1, source 2) syntax trees and usage examples can be found; however, for example, it includes examples of UNCOORDINATED usage that you really can't use in the tool (see imagen above).

According to all this, to my understanding, it can be said you can use the COMMIT and ROLLBACK keywords in some way, to control UNCOORINDATED database transactions (you can mark a transaction as UNCOORDINATED in some way).

After doing several tests, I found the following:
  • The COMMIT and ROLLBACK keywords can be used alone or o accompanied by a specific database  reference (for example, just "COMMIT;" or "COMMIT Database.DSN1;").
  • The UNCOORDINATED keyword can't be used anywhere.
  • In a Compute node accessing databases, the negotations with the database referred in the "Data Source" node property make part of the flow's global (coordinated) transactionality; this transactionality is affected by alone COMMIT and ROLLBACK sentences.
    • You can reference the database referred in the "Data Source" node property in two ways (implicitly or explicitly); the transactionality behaviour for this Database is always the same, no matter which form you use:
      • Implicitly: for example "UPDATE Database.TableX..." or "COMMIT;".
      • Explicitly: for example "UPDATE Database.DSN1.TableX..". or "COMMIT Database.DSN1;", where DSN1 is the value of the "Data Source" property.
  • In a Compute node accessing databases, the negotations with databases other than the referred in the "Data Source" node property (for example INSERT INTO Database.DSNX...) behave as "auxiliary database transactions"; they are affected only by COMMIT or ROLLBACK statements accompanied by the specific database  reference (for example "COMMIT Database.DSNX;").

Test scenary:


Node MQ Input is transactional; it's queue is persistent (with a backout value greater than 0 and a backout queue).


The Compute node contains the following:


The first two sentence groups (X) make reference to the same DSN defined in the node's "Data Source" property (the first group does it implicitly and the second group does it explicitly).

The value of the "Transaction" Compute node property varies between the tests.

The "Compute ErrorCompute node executes the following or nothing, depending on the test.


Tests:

Test1: Compute node "Transaction" is "Automatic", COMMIT or ROLLBACK not invoked, flow fails (THROW USER EXCEPTION in "Compute Error" node). 

Result: no data is inserted in any of the databases.

Explanation: no matter the database nor the coordinated or uncoordinated nature, having no COMMIT sentences executed, due to the "Automatic" node transaction type, the flow, after failing, ROLLBACKs to all the pending transactions.




Test2: Compute node "Transaction" is "Automatic"COMMIT or ROLLBACK not invoked, flow doesn't fail (no THROW USER EXCEPTION in "Compute Error" node). 


Result: data is inserted in all databases.

Explanation: no matter the database nor the coordinated or uncoordinated nature, having no COMMIT sentences executed, due to the "Automatic" node transaction type, the flow, after successful finishing, COMMITs to all the pending transactions.




Test3: Compute node "Transaction" is "Commit"COMMIT or ROLLBACK not invoked, flow fails (THROW USER EXCEPTION in "Compute Error" node). 

Expected Result: data is inserted in Table X (TABLAX; Compute node's "Data Source"'s DSN) inmediatly after finishing Compute node execution (it doesn't matter if the entire flow finish with success o failure). No data is inserted in Table Y (TABLAY).

Explanation: the interaction with the main database (X; Compute node's "Data Source"'s DSN) are governed with the "normal semantics", that is, according the success or failure of the node execution (node, not flow, because of the "Commit" value of "Transaction"). The interactions with other databases are governed according this:

Fig X1.

Result: just when the Compute node finish its execution (even when the flow haven't finished), data is inserted in Table X (TABLAX):


In that moment, as expected, Table Y (TABLAY) is empty:


When the flow ends, Table Y (TABLAY) is still empty:



Test4: Compute node "Transaction" is "Commit"COMMIT or ROLLBACK not invoked, flow doesn't fail (no THROW USER EXCEPTION in "Compute Error" node). 



Expected Result: data is inserted in Table X (TABLAX; Compute node's "Data Source"'s DSN) inmediatly after finishing Compute node execution (it doesn't matter if the entire flow finish with success o failure); in that moment, no data is visible in Table Y (TABLAY). When the flow ends, data is inserted in Table Y (TABLAY).

Explanation: the same of test 3.

Result: just when the Compute node finish its execution (even when the flow haven't finished), data is inserted in Table X (TABLAX) (view test 3 image).


In that moment, as expected, Table Y (TABLAY) is empty:


When the flow ends, Table Y (TABLAY) has data:



Test5: Compute node "Transaction" is "Automatic"COMMIT invoked (alone, without DSN specification), flow fails (THROW USER EXCEPTION in "Compute Error" node). 


Expected Result: data is inserted in Table X (TABLAX; Compute node's "Data Source"'s DSN) inmediatly after COMMIT is executed (it doesn't matter if the node or the entire flow finish with success o failure). The COMMIT execution only affects DSN X. No data is inserted in Table Y (TABLAY).

Explanation: the interaction with the main database (X; Compute node's "Data Source"'s DSN) are governed with the "normal semantics", that is, according the success or failure of the node execution or can be affected with COMMIT or ROLLBACK sentences without DSN specification. The interactions with other databases are governed according the stated in FigX1.

Result: just when COMMIT is executed (even when the node haven't finished), data is inserted in Table X (TABLAX).




When the flow ends, the two tables keep with the same state (TABLAX with data, TABLAY empty).

Test6: Compute node "Transaction" is "Commit"COMMIT invoked (alone, without DSN specification), flow fails (THROW USER EXCEPTION in "Compute Error" node). 


The results and explanation are the same of Test 5.

Test7: Compute node "Transaction" is "Commit" or "Automatic"COMMIT invoked (alone, without DSN specification), flow doesn't fail (no THROW USER EXCEPTION in "Compute Error" node). 


Expected Result: data is inserted in Table X (TABLAX; Compute node's "Data Source"'s DSN) inmediatly after COMMIT is executed (it doesn't matter if the node or the entire flow finish with success o failure). The COMMIT execution only affects DSN X. When the flow ends, data is inserted in Table Y (TABLAY).

Explanation: the same as Test5.

Result: just when COMMIT is executed (even when the node haven't finished), data is inserted in Table X (TABLAX).




In that moment, as expected, Table Y (TABLAY) is empty:


When the flow ends, Table Y (TABLAY) has data:



Test8: Compute node "Transaction" is "Automatic"COMMIT invoked (with DSN specification for database Y), flow doesn't fails (no THROW USER EXCEPTION in "Compute Error" node). 


Expected Result: data is inserted in Table Y (TABLAY) inmediatly after COMMIT is executed (even if the node hasn't finished execution; it doesn't matter if the node or the entire flow finish with success o failure). The COMMIT execution only affects DSN Y. No data is inserted in Table X (TABLAX) because the flow fails.

Explanation: the interaction with the main database (X; Compute node's "Data Source"'s DSN) are governed with the "normal semantics", that is, according the success or failure of the node or flow execution or can be affected with COMMIT or ROLLBACK sentences without DSN specification; they are obviously not affected by COMMITs accompanied by a specific DSN reference. The interactions with other databases are governed according the stated in FigX1.

Result: just when COMMIT is executed (even when the node haven't finished), data is inserted in Table Y (TABLAY).




In that instant, TABLAX is empty.


When the flow ends, TABLAX is still empty.



Test9: Compute node "Transaction" is "Commit"COMMIT invoked (with DSN specification for database Y), flow fails (THROW USER EXCEPTION in "Compute Error" node). 


Expected Result: data is inserted in Table Y (TABLAY) inmediatly after COMMIT is executed (even if the node hasn't finished execution; it doesn't matter if the node or the entire flow finish with success o failure). The COMMIT execution only affects DSN Y. Once the node finishes its execution, data is inserted im Table X (TABLAX); it doesn't matter if the entire flow finish with success o failure.

Explanation: the same as Test8.

Result: just when COMMIT is executed (even when the node haven't finished), data is inserted in Table Y (TABLAY).





In that instant, TABLAX is empty.


When the node ends, TABLAX has data

.



Test10: Compute node "Transaction" is "Commit" or "Automatic"COMMIT invoked (with and without DSN specification for database Y), flow fails (THROW USER EXCEPTION in "Compute Error" node). 


Expected Result: data is inserted in Table Y (TABLAY) and Table X (TABLAX) inmediatly after COMMITs are executed (even if the node hasn't finished execution; it doesn't matter if the node or the entire flow finish with success o failure).

Explanation: the same as Test8.

Result: just when COMMITs are executed (even when the node haven't finished), data is inserted in Table X (TABLAX) and Table Y (TABLAY).











Test11: Compute node "Transaction" is "Commit" or "Automatic", ROLLBACK invoked (without DSN specification), flow doesn't fail (no THROW USER EXCEPTION in "Compute Error" node). 



Expected Result: data is inserted in Table Y (TABLAY) after the flow ends. Table X (TABLAX) is empty.

Explanation: the same as Test8.

Result: 



Regardless the node transactionality configuration, the data in Table Y (TABLAY) is only commited when the flow finalizes (not the node).


Test12: Compute node "Transaction" is "Commit" or "Automatic"ROLLBACK invoked (with and without DSN specification), flow doesn't fail (no THROW USER EXCEPTION in "Compute Error" node). 


Expected Result: both tables are empty when the flow fhinishes.

Explanation: the same as Test8.

Result: 




Test13: Compute node "Transaction" is "Automatic"ROLLBACK and COMMIT invoked (with and without DSN specification), flow doesn't fail (no THROW USER EXCEPTION in "Compute Error" node). 


Result: 




Test14: COMMIT and ROLLBACK placed in not permitted places (Compute node without Data Source)

The following error is generated (in spanish):

'[Microsoft][Administrador de controladores ODBC] No se encuentra el nombre del origen de datos y no se especificó ningún controlador predeterminado'