Restarting the servers appears to have resolved the problem. Note that we are still working on the separate connectivity problem affecting ports 443 and 993.
Expect to see an unscheduled migration announcement shortly.
Posted 8 days ago. Jan 09, 2019 - 13:57 UTC
During yesterday's deploy, connectivity on ports 443 and 993 was broken, preventing devices that do not use the default port 31314 (i.e. those behind a restrictive firewall) from connecting.
This morning, we reconfigured the network interfaces to fix that, which caused a brief network outage for agents. The outage sent certain agents into a fast-retry loop, which in turn starved other agents of resources.
We have reverted the network configuration change and restarted the affected servers, which has temporarily resolved the problem.
We will be partially rolling back yesterday's deploy until we better understand the problem. Expect to see an unscheduled migration announcement later today.
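For illustration only: a fast-retry loop like the one described above is commonly avoided with capped exponential backoff plus random jitter, so that many clients recovering from the same outage do not all retry at once. The sketch below is a generic Python example, not the agent or server code involved in this incident; the function names (`backoff_delays`, `call_with_backoff`) and the default values are hypothetical.

```python
import random
import time


def backoff_delays(base=0.5, cap=60.0, attempts=6, rng=random.random):
    """Yield one randomized wait (in seconds) per attempt.

    The upper bound doubles with each attempt and is limited by `cap`;
    the actual wait is a random fraction of that bound ("full jitter").
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


def call_with_backoff(operation, sleep=time.sleep, **backoff_options):
    """Call `operation` until it succeeds or the attempts run out."""
    last_error = None
    for delay in backoff_delays(**backoff_options):
        try:
            return operation()
        except ConnectionError as err:
            last_error = err
            # Wait before the next attempt instead of retrying immediately.
            sleep(delay)
    if last_error is None:
        raise ValueError("attempts must be at least 1")
    raise last_error
```

With these hypothetical defaults, a client that keeps failing waits up to 0.5 s, then up to 1 s, 2 s, and so on, rather than retrying in a tight loop.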
Posted 8 days ago. Jan 09, 2019 - 13:03 UTC
Outbound HTTP connections from agents are failing intermittently.
Posted 8 days ago. Jan 09, 2019 - 11:19 UTC
This incident affected: Developer Imp Servers and Production Imp Servers.