How to cancel a hung deployment?

Post by isme » Wed Dec 18, 2013 7:35 pm

Normally my deployment takes about five minutes to complete.

The latest one has been running for 40 minutes with no end in sight.

The log goes no further than starting to execute a custom pre-deployment script:

Code: Select all
2013-12-18 17:48:16 +00:00 INFO   Looking for PowerShell scripts named PreDeploy.ps1
2013-12-18 17:48:16 +00:00 INFO   Calling PowerShell script: 'G:\Temp\p1piiqgl.csd\Packages\..\Applications\INT-CI\Rhubarb.Rhubarb.Rhubarb\1.0.8711.241\db\state\Replication\PreDeploy.ps1'


It turns out that the script contains a subtle infinite loop.

How do I cancel a runaway deployment?

My Deployment Manager version is v2.3.4.13. The host runs PowerShell v2.

EDIT: I removed the script I originally posted because a different script actually caused the problem. PEBKAC!
Last edited by isme on Thu Dec 19, 2013 11:44 am, edited 2 times in total.
Iain Elder, Skyscanner
isme
 
Posts: 83
Joined: Tue Jun 12, 2012 1:49 pm
Location: Edinburgh

Post by isme » Wed Dec 18, 2013 9:57 pm

We abandoned the deployment by restarting the Windows host.

Is there a less disruptive way to do this?

Here's what we tried.

I used this command to restart the Red Gate Deployment Manager (RGDM) service.

Code: Select all
Restart-Service -InputObject (Get-Service -ComputerName 'redgatedeploy' -Name 'Red Gate Deployment Manager') -Confirm


Initially it looked like it worked.

Code: Select all
$ Get-Service -ComputerName 'redgatedeploy' -Name 'Red Gate Deployment Manager'

Status   Name               DisplayName                           
------   ----               -----------                           
Running  Red Gate Deploy... Red Gate Deployment Manager           



But a few seconds later, the service stopped again.

Code: Select all
$ Get-Service -ComputerName 'redgatedeploy' -Name 'Red Gate Deployment Manager'

Status   Name               DisplayName                           
------   ----               -----------                           
Stopped  Red Gate Deploy... Red Gate Deployment Manager           



At this point we figured the easiest thing to do was to restart the host.

Code: Select all
Restart-Computer -ComputerName 'redgatedeploy' -Force -Confirm


After a couple of minutes, the host responded to Get-Service requests again.

Code: Select all
$ Get-Service -ComputerName 'redgatedeploy' -Name 'Red Gate Deployment Manager'

Status   Name               DisplayName                           
------   ----               -----------                           
Stopped  Red Gate Deploy... Red Gate Deployment Manager           



We tried to start the service again.

Code: Select all
Start-Service -InputObject (Get-Service -ComputerName 'redgatedeploy' -Name 'Red Gate Deployment Manager') -Confirm


This time it stayed up.

Worryingly, there is no history of the bad deployment in the user interface.

The deployment history for my project says that the most recent deployment was successful.

Code: Select all
Version          Status         Date                              Environment     Deployed by
1.0.8711.241     Successful     18 December 2013 17:46 +00:00     INT-CI          svc_teamcity

Post by isme » Wed Dec 18, 2013 10:19 pm

RGDM has logged these events in the Windows Application log since I attempted to restart the service.

2013-12-18 18:47:14,783 [80] ERROR RedGate.Deploy.Server.Tasks.TaskRunner [(null)] - System.OperationCanceledException: The operation was canceled.

2013-12-18 18:47:15,221 [51] ERROR RedGate.Deploy.Server.Tasks.TaskRunner [(null)] - System.OperationCanceledException: The operation was canceled.

2013-12-18 18:47:21,471 [9] ERROR RedGate.Deploy.Shared.Startup.Host [(null)] - System.ServiceModel.AddressAlreadyInUseException: There is already a listener on IP endpoint 0.0.0.0:10302. Make sure that you are not trying to use this endpoint multiple times in your application and that there are no other applications listening on this endpoint. ---> System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted

2013-12-18 18:48:16,956 [5] ERROR RedGate.Deploy.Shared.Startup.CommandProcessor [(null)] - Command RedGate.Deploy.Agent.Commands.SingleShotDeploymentCommand failed
RedGate.Deploy.Shared.Startup.CommandException: Timed out executing Powershell script 'G:\Temp\p1piiqgl.csd\Packages\..\Applications\INT-CI\Rhubarb.Rhubarb.Rhubarb\1.0.8711.241\db\state\Replication\PreDeploy.ps1'. Powershell.exe process was forcibly terminated after 01:00:00

2013-12-18 18:48:16,893 [5] ERROR RedGate.Deploy.Agent.Services.Jobs.JobRunner [(null)] - Timed out executing Powershell script 'G:\Temp\p1piiqgl.csd\Packages\..\Applications\INT-CI\Rhubarb.Rhubarb.Rhubarb\1.0.8711.241\db\state\Replication\PreDeploy.ps1'. Powershell.exe process was forcibly terminated after 01:00:00

2013-12-18 18:50:05,645 [9] ERROR RedGate.Deploy.Shared.Startup.Host [(null)] - System.ServiceModel.AddressAlreadyInUseException: There is already a listener on IP endpoint 0.0.0.0:10302. Make sure that you are not trying to use this endpoint multiple times in your application and that there are no other applications listening on this endpoint. ---> System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted

2013-12-18 18:51:33,036 [9] ERROR RedGate.Deploy.Shared.Startup.Host [(null)] - System.ServiceModel.AddressAlreadyInUseException: There is already a listener on IP endpoint 0.0.0.0:10302. Make sure that you are not trying to use this endpoint multiple times in your application and that there are no other applications listening on this endpoint. ---> System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted

2013-12-18 19:08:59,075 [36] WARN RedGate.Deploy.Server.Tasks.TaskRunner [(null)] - Task deployments-5793 exited with error Deployment on the agent failed.

2013-12-18 19:08:58,887 [36] ERROR RedGate.Deploy.Server.Tasks.TaskRunner [(null)] - System.AggregateException: One or more errors occurred. ---> RedGate.Deploy.Server.Tasks.ActivityFailedException: Deployment on the agent failed.


I've omitted the stack trace.

Ask me if you want a complete copy of the filtered log.

To generate this, I connected Event Viewer on my workstation to the RGDM host...

Code: Select all
mmc eventvwr.msc /computer:redgatedeploy


...and used this filter to show only the events from RGDM.

Code: Select all
<QueryList>
  <Query Id="0" Path="Application">
    <Select Path="Application">
      *[System[Provider[@Name='Red Gate Deployment Manager']]]
    </Select>
  </Query>
</QueryList>



I tried to restart the service twice before restarting the host, which could explain why System.ServiceModel.AddressAlreadyInUseException appears repeatedly (three times in the excerpt above).
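One way to check that theory would be to see which process still owns port 10302 (the port named in the exception) before starting the service again. The following is only a sketch: Get-ListenerPid is a hypothetical helper, not part of RGDM, and it assumes the column layout that Windows netstat -ano normally prints. It should work on PowerShell v2.

```powershell
# Hypothetical helper: extract the owning process ID of a TCP listener
# from the text output of "netstat -ano".
function Get-ListenerPid {
    param(
        [string[]]$NetstatLines,
        [int]$Port
    )
    # Assumed layout: proto, local address, foreign address, state, PID.
    $pattern = '^\s*TCP\s+\S+:{0}\s+\S+\s+LISTENING\s+(\d+)\s*$' -f $Port
    foreach ($line in $NetstatLines) {
        if ($line -match $pattern) {
            [int]$Matches[1]
        }
    }
}

# Run this on the RGDM host itself; netstat only reports local sockets.
if (Get-Command netstat -ErrorAction SilentlyContinue) {
    foreach ($ownerId in (Get-ListenerPid -NetstatLines (netstat -ano) -Port 10302)) {
        Get-Process -Id $ownerId
    }
}
```

If this lists a process after the service has supposedly stopped, an earlier instance is probably still holding the endpoint.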

Post by Mike Upton » Thu Dec 19, 2013 2:56 pm

The default timeout period for PowerShell scripts is one hour. You can reduce that by setting the RedGatePowerShellTimeout variable. The value should be in the form hh:mm:ss. For example, for a five-minute timeout, set RedGatePowerShellTimeout to 00:05:00.
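To avoid typos when writing the value by hand, the hh:mm:ss string can be generated from a TimeSpan. This is a minimal sketch: Format-DeployTimeout is a hypothetical helper name, and whether Deployment Manager accepts values of 24 hours or more is not confirmed here.

```powershell
# Hypothetical helper: render a TimeSpan as the hh:mm:ss text
# described for the RedGatePowerShellTimeout variable.
function Format-DeployTimeout {
    param([TimeSpan]$Timeout)
    # Whole hours come from TotalHours so that spans over a day are not truncated;
    # acceptance of such long values by Deployment Manager is an untested assumption.
    '{0:00}:{1:00}:{2:00}' -f [math]::Floor($Timeout.TotalHours), $Timeout.Minutes, $Timeout.Seconds
}

Format-DeployTimeout (New-TimeSpan -Minutes 5)   # 00:05:00
Format-DeployTimeout (New-TimeSpan -Hours 2)     # 02:00:00
```

The resulting string is what you would paste in as the value of RedGatePowerShellTimeout.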

By the way, the Red Gate Deployment Manager service (process name RedGate.Deploy.Server.exe) doesn't run the deployment directly; the Deployment Agent (RedGate.Deploy.Agent.exe) is responsible for actually executing the deployment. For standard deployments this is the Red Gate Deployment Agent service, normally running on a remote machine. For database deployments, the DM server spawns RedGate.Deploy.Agent.exe as a child process. If you need to terminate a running deployment, it's the agent process that you need to kill.
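As a rough sketch of what killing the agent could look like from a workstation: Stop-DeploymentAgent is a hypothetical wrapper around the standard Windows taskkill command, the host name redgatedeploy is the one used earlier in this thread, and the image name is the executable mentioned above. The /T switch also ends child processes, on the assumption that the hung powershell.exe runs as a child of the agent. Point it at whichever machine actually runs the agent for the deployment in question.

```powershell
# Hypothetical wrapper: kill the agent process (and its children) on a host.
# Stop-Process cannot act on processes of a remote computer, so taskkill is used.
function Stop-DeploymentAgent {
    [CmdletBinding(SupportsShouldProcess=$true)]
    param(
        [string]$ComputerName = 'redgatedeploy',
        [string]$ImageName = 'RedGate.Deploy.Agent.exe'
    )
    if ($PSCmdlet.ShouldProcess("$ImageName on $ComputerName", 'taskkill /T /F')) {
        # /S = remote system, /IM = image name, /T = include child processes, /F = force
        & taskkill.exe /S $ComputerName /IM $ImageName /T /F
    }
}

# Dry run: reports what would be done without killing anything.
Stop-DeploymentAgent -ComputerName 'redgatedeploy' -WhatIf
```

Drop -WhatIf to perform the kill, or use -Confirm to be prompted first, as in the Restart-Service calls earlier in the thread.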

We don't currently have the ability to cancel deployments that are in progress; if you'd like us to implement that feature, please vote for it on UserVoice.
Mike Upton

Project Manager - SQL Compare|Data Compare|Comparison SDK
Redgate Software Ltd.
Mike Upton
 
Posts: 189
Joined: Wed May 11, 2011 8:04 am
Location: Redgate

Post by isme » Thu Dec 19, 2013 6:41 pm

Thanks, Mike. I've shared this thread with my team. It's useful knowledge and practical advice.

The next time we encounter a runaway deployment, we'll try to kill just the agent processes rather than the server process.

We may decide to reduce the default PowerShell time limit for most of our deployments to defend against being blocked by infinite loops.

We expect some of our bulk data transfer operations to take more than one hour. In these situations it might make sense to increase the limit.
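Independently of the RedGatePowerShellTimeout setting, a script can guard its own wait loops so that a condition that never becomes true fails fast instead of spinning until the deployment timeout. This is only a sketch of the idea: Wait-Until is a hypothetical helper, not something Deployment Manager provides, and the default limits are arbitrary.

```powershell
# Hypothetical helper: poll a condition, but give up after a deadline
# instead of looping forever.
function Wait-Until {
    param(
        [scriptblock]$Condition,
        [TimeSpan]$Timeout = (New-TimeSpan -Minutes 5),
        [int]$PollMilliseconds = 500
    )
    $watch = [System.Diagnostics.Stopwatch]::StartNew()
    while (-not (& $Condition)) {
        if ($watch.Elapsed -ge $Timeout) {
            # Throwing makes the script fail visibly rather than hang.
            throw "Condition not met after $Timeout; giving up."
        }
        Start-Sleep -Milliseconds $PollMilliseconds
    }
}

# Example call (the path is illustrative only):
# Wait-Until -Condition { Test-Path 'C:\Temp\ready.flag' } -Timeout (New-TimeSpan -Minutes 2)
```

A failing script still needs to surface as a failed deployment, so it is worth checking how PreDeploy.ps1 errors are reported before relying on this.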

Thanks for your help!