In context of Automation Server, request is set of Provisioning commands bound together in one transaction. In case of any failure, Automation Server reverts results of commands executed within request by the moment of failure. However, it happens that rollback fails as well. In cases like this, system administrator's intervention is required.

Symptoms

  • Website has status 'pending' for a long time (more then a 2 minutes is enough to doubt)
  • Customer's add/edit/delete actions trough Hosting Control Panel are failing with error like "Service SomeService is already involved in request with id: ae842739-d8e6-4cFa-b481-0d58a79b7667"
Careful!

First symptom can also indicate that Widows service Atomia Automation Server Provisioning Engine is not started. Check this first before conclusions.

Reason

Reason for request failure can not be specified. This is because Provisioning commands within one request are basically module commands, so any exception raised by any of a more then 20 Atomia Provisioning Modules or by any third party module, can cause request failure.

Solution

First, you need to determine request id. Sometimes you get it from exception message, like in second symptom. But if not, like in first, there is an easy way to do so. Execute following query on atsql01 server, AtomiaProvisioning2 database:

1
2
3
4
5
6
SELECT rid, account_id, request_status, date_time
FROM provisioning_request
WHERE account_id = '123456'
AND request_status = 'FailedRollbackError'
ORDER BY date_time DESC
-- Make sure to replace account_id from line 3 with actual account id

Usually, there should be just one row as a result. If there are more, just apply following steps on each of them.

For purpose of this guide, let's assume that there is one resulting row. Information that we ware looking for is request id in rid column. Write it down somewhere, you will need it multiple times by the end of this guide.

In following query, replace example request id with one you have, and execute it:

1
2
3
4
5
6
SELECT pj.date_time, rpc.resource_cmd_id, rpc.command, rpc.status command_status, pj.status journal_status, pj.message
FROM provisioning_journal pj
INNER JOIN resource_provisioning_commands rpc ON pj.fk_cmd_id = rpc.resource_cmd_id
WHERE rpc.fk_rid='ae842739-d8e6-4cFa-b481-0d58a79b7667'
ORDER BY pj.id
-- Make sure to replace fk_rid from line 4 with actual request id

Result of this query will show you individual commands within one request ordered by execution time.

Go down trough journal_status column to row with value 'Failed'. This is the point of failure for our request. Fix the error described in message column (I will not explain how to fix this error, since there could be any of countless number of errors. If you do not know how to solve it, contact technical support for help).

Continue down trough journal_status column to row with value 'RollBackFailed'. This is the point of rollback failure for our request. Fix the error described in message column (I will not explain how to fix this error, since there could be any of countless number of errors. If you do not know how to solve it, contact technical support for help). Write down value from resource_cmd_id column in this row, you will need it.

Now that both reasons for failure are fixed. we should continue request. There is a feature that provides this in Automation Server Client, but before that, there are several preparation steps. Execute following query:

1
2
3
4
5
6
7
8
9
SELECT rpc.resource_cmd_id cmd_id, rpc.command, ps.name service_name, rpc.status cmd_status, ps.status physical_service_status, ls.status logical_service_status, p.param_name, p.param_value, sspa.date_time, ps.service_id
FROM resource_provisioning_commands rpc
 INNER JOIN simple_service_provisioning_actions sspa ON rpc.resource_cmd_id = sspa.fk_resource_cmd_id
 INNER JOIN service_logical_struct ls ON sspa.fk_logical_service_id = ls.lid
 INNER JOIN ucp_services ps ON ls.fk_sid = ps.service_id
 INNER JOIN params p ON ps.service_id = p.fk_sid
WHERE rpc.fk_rid = '32BEBD7C-F71C-4B3B-B3A0-9DBCA5C134E0'
ORDER BY rpc.resource_cmd_id
-- Make sure to replace fk_rid from line 7 with actual request id

Result of this query shows details of services that are involved in request. Check if there is some with value "Dirty" in physical_service_status and/or logical_service_status columns (write down service_id and cmd_id values). What need to be done in this point is to update this statuses to real ones and to update command status to appropriate one.

Based on service_nameparam_value and param_name, it is easy to identify actual service on resource machine. Once you determine actual service status, update it's physical and logical status accordingly.

 

1
2
3
4
5
6
7
8
9
10
11
UPDATE service_logical_struct SET
status = 'Status' -- Actual status can be 'Nonexistent' or 'Ok'
WHERE fk_sid IN ('0D0DA509-CB45-42D2-9852-992AB7D3F70C')
 AND status = 'Dirty'
 
UPDATE ucp_services SET
status = 'Status' -- Actual status can be 'Nonexistent' or 'Ok'
WHERE service_id IN ('0D0DA509-CB45-42D2-9852-992AB7D3F70C')
 AND status = 'Dirty'
-- Make sure to replace fk_sid and service_id with actual service id.
-- Transactions usage is recommended