Wednesday, September 23, 2015

Workflow stuck in Pending State - AX 2009

Workflows (WF) sometimes get stuck at one point of evaluation. Most often, a step evaluates to false and yet the WF does not move ahead. Even after all the assignees have completed their tasks, the WF does not move to the next step. It remains in the same state for over a day, and sometimes stays pending for weeks or more.

There are several known scenarios in which this happens:

WF Batch - Check that the batch is set to run at regular intervals, say hourly, so that records are processed on the same day.
WF Batch configuration - Check that the WF batch is configured to run on the Server (if it is set to Client, reset it to Server).
WF Configuration is not active - Activate the correct version of the WF and set it as the default.
Users Check - Check that all the users who are part of the configuration are active in the system. If not, deactivate that version, reconfigure the WF without the inactive users, and activate the new version.
Error Log - Check the error log to see whether all connectors and WF servers are up and running.

The causes above are well known and account for almost all pending records. There is, however, another cause, unknown to most, that stops Workflows from processing: old and pending WF records.

While troubleshooting, nothing appears to be wrong with the configurations or the setups. At first, the WF just seems to be hung up because of slow processing. But when the Tutorial Processor is run, the WF moves just one more step and then halts. No work items are created and no assignments are made. It is just stuck. The error log, however, shows threading and deadlock issues in the same time frame in which the WF is hung up. No other logs are recorded.

A lot of pending Workflows, those that will never be processed further, exist either because they are no longer part of the active configuration of the Workflow or because the records are no longer required. These records stay in the Pending state, and every time the WF engine runs, it picks them up as records that need processing. This means that these unwanted records keep interfering with the new records, putting excess load on the engine.

This is an issue with the AX 2009 WF. The pending records keep interfering with the new threads (for the newer WF) and, since they cannot be processed, eventually end up in a deadlock. This scenario is not common; it builds up over a long period of time, once a substantial number of WF have been created in the system.

The resolution is simple to describe, but it takes some care to carry out. Do note that it is to be applied only during system downtime, when no user has pending tasks in the system.

Resolution steps:

1- Cancel any WF that has been pending for over a quarter, that is not part of the active WF configuration, or whose record is no longer needed.

2- Stop AOS

3- Delete AUC files: This is to ensure that any pending WF that might be cached are flushed. The system will rebuild the cache once the AOS is restarted and the users log in to their systems. This step will also flush the user preferences and other user-specific cache from the system. The AUC files need to be cleared for all users.

4- Restart Dynamics Ax service (AOS)

5- Compile the entire tree from the AOT: This will clear up any other issues, just in case.

6- Synchronize data dictionary: To ensure that the schema and data are in sync and there are no discrepancies.

7- Test your workflow.
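
When there are many user profiles, step 3 can be scripted. Below is a minimal Python sketch, assuming the client cache files end in .auc and sit somewhere under a common profiles folder (typically each user's AppData\Local folder on recent Windows versions, but do verify the location in your environment). The function name and the dry_run flag are my own and are not part of AX. Run it only while the AOS is stopped and the clients are closed.

```python
from pathlib import Path


def delete_auc_files(profiles_root, dry_run=True):
    """Return all *.auc files found under profiles_root.

    With dry_run=True (the default) nothing is deleted, so the list
    can be reviewed first. With dry_run=False the files are removed.
    """
    root = Path(profiles_root)
    # Match the extension case-insensitively (.auc, .AUC, ...).
    found = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".auc"
    )
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Call it once with the default dry_run=True, for example delete_auc_files(r"C:\Users"), review the returned list, and only then call it again with dry_run=False.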

It might take some time before the WF are processed normally. Since the cache was cleared, the system will build it up from scratch, and it might be a while before any difference in processing is noticed. Usually, it takes at least 3-4 days to observe differences in processing speed.

It is advisable to do this activity over a weekend, so that the system can build up the cache while the users are offline and the users are not disrupted either.


To prevent such a build-up in the future, make sure to cancel, every quarter, any WF that has been pending in the system for a long time. This will ensure that no threads are left hanging to interfere with the processing in the system.
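
To make the quarterly review easier, the candidates for cancellation can be listed from an export of the workflow records. This is only a sketch: the record layout (id, status, created, config_active) is hypothetical and does not reflect actual AX table or field names, and "a quarter" is taken here as 90 days. It only lists candidates; the cancellation itself is still done in AX.

```python
from datetime import datetime, timedelta


def cancel_candidates(records, now, max_age_days=90):
    """List pending WF records that look safe to review for cancellation.

    A record qualifies if it is Pending and either older than
    max_age_days or tied to a configuration that is no longer active.
    The dictionary keys used here are hypothetical export columns.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        r for r in records
        if r["status"] == "Pending"
        and (r["created"] < cutoff or not r["config_active"])
    ]
```

For example, calling cancel_candidates(exported_records, datetime.now()) returns the records to review, where exported_records is whatever list you build from your own export.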

Hope this helps :)
