driver/docker: protect against nil container
Protect against a panic when we attempt to start a container with a name that conflicts with an existing one. If the existing container is being deleted while Nomad first attempts to create the new one, the createContainer call fails with `container already exists`, but the `containerByName` lookup then returns a nil container reference, and dereferencing it causes a crash.

I'm not certain how we get into this state, except by being very unlucky. I suspect this case may be the result of a concurrent restart, or of the Docker engine API not being fully consistent (e.g. an earlier call purged the container, but Docker hasn't yet freed up the resources needed to create a new container with the same name). If that's the case, then re-attempting creation will hopefully succeed, or we'd at least fail enough times for the alloc to be rescheduled to another node.
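The guard described above can be sketched in isolation. This is a minimal, self-contained illustration and not the driver's actual code: `fakeClient`, `createWithRetry`, `errAlreadyExists`, and the `Container` type are hypothetical stand-ins, and the sketch omits logging and any delay between attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// Container is a hypothetical stand-in for the Docker client's container type.
type Container struct{ ID string }

// errAlreadyExists models the "container already exists" conflict error.
var errAlreadyExists = errors.New("container already exists")

// fakeClient is a hypothetical client that simulates the race: create reports
// a name conflict for the first `conflicts` calls, while lookup never finds
// the conflicting container because it is already being deleted.
type fakeClient struct {
	conflicts int
	removed   []string
}

func (c *fakeClient) create(name string) (*Container, error) {
	if c.conflicts > 0 {
		c.conflicts--
		return nil, errAlreadyExists
	}
	return &Container{ID: "new-" + name}, nil
}

// lookup returns (nil, nil): no error, but no container either.
func (c *fakeClient) lookup(name string) (*Container, error) {
	return nil, nil
}

func (c *fakeClient) remove(id string) error {
	c.removed = append(c.removed, id)
	return nil
}

// createWithRetry retries creation on a name conflict and purges the
// conflicting container only when the lookup actually returned one.
func createWithRetry(c *fakeClient, name string, maxAttempts int) (*Container, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		created, err := c.create(name)
		if err == nil {
			return created, nil
		}
		if !errors.Is(err, errAlreadyExists) {
			return nil, err
		}

		existing, err := c.lookup(name)
		if err != nil {
			return nil, err
		}
		// Without this nil check, existing.ID would panic when the
		// conflicting container vanished between create and lookup.
		if existing != nil {
			if err := c.remove(existing.ID); err != nil {
				return nil, fmt.Errorf("failed to purge container %s: %w", existing.ID, err)
			}
		}
	}
	return nil, fmt.Errorf("failed to create container %q after %d attempts", name, maxAttempts)
}

func main() {
	c := &fakeClient{conflicts: 2}
	container, err := createWithRetry(c, "web", 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("created", container.ID, "after purging", len(c.removed), "containers")
}
```

In this sketch a nil lookup result simply falls through to the next attempt, which mirrors the intent stated in the commit message: either a later attempt succeeds, or the caller receives an error once the attempts are exhausted.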
parent eab600d3e1
commit dff071c3b9
@@ -439,16 +439,21 @@ CREATE:
 		return container, nil
 	}
 
-	// Delete matching containers
-	err = client.RemoveContainer(docker.RemoveContainerOptions{
-		ID: container.ID,
-		Force: true,
-	})
-	if err != nil {
-		d.logger.Error("failed to purge container", "container_id", container.ID)
-		return nil, recoverableErrTimeouts(fmt.Errorf("Failed to purge container %s: %s", container.ID, err))
-	} else {
-		d.logger.Info("purged container", "container_id", container.ID)
+	// Purge conflicting container if found.
+	// If container is nil here, the conflicting container was
+	// deleted in our check here, so retry again.
+	if container != nil {
+		// Delete matching containers
+		err = client.RemoveContainer(docker.RemoveContainerOptions{
+			ID: container.ID,
+			Force: true,
+		})
+		if err != nil {
+			d.logger.Error("failed to purge container", "container_id", container.ID)
+			return nil, recoverableErrTimeouts(fmt.Errorf("Failed to purge container %s: %s", container.ID, err))
+		} else {
+			d.logger.Info("purged container", "container_id", container.ID)
+		}
 	}
 
 	if attempted < 5 {