Resilience testing with GMS application
Instantiation API call in flight
If the orchestrator goes down while the instantiation API call is in flight, the API call fails with an error.
The API call can be reissued once the service comes back up, and application instantiation should then complete.
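The reissue step can be sketched as a small retry loop. This is a minimal illustration, not the test script used here: send is a hypothetical zero-argument callable that issues the instantiate (or terminate) POST and returns the HTTP status code, and it is assumed to raise Python's built-in ConnectionError when the service is unreachable.

```python
import time


def reissue_until_accepted(send, max_attempts=10, delay_s=5.0, sleep=time.sleep):
    """Call send() until it stops raising ConnectionError.

    send is a hypothetical callable (an assumption of this sketch) that
    issues the POST and returns the HTTP status code.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ConnectionError as err:
            # ConnectionRefusedError and ConnectionResetError are both
            # subclasses of ConnectionError, so either failure is retried.
            last_error = err
            if attempt < max_attempts:
                sleep(delay_s)
    raise RuntimeError(
        f"service still unreachable after {max_attempts} attempts"
    ) from last_error
```

Many HTTP clients wrap the socket error in their own exception type, so send is assumed to translate that into ConnectionError before it reaches this loop.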
Instantiation API call completes but status API fails
In this case the orchestrator goes down after the instantiation API call has succeeded but before the status is updated. The status API returns an error, but the instantiation succeeds once the orchestrator comes back up.
Response Code: 202
API Response code 202. Waiting...
Because the orchestrator went down, the status GET call fails with a connection refused message:
Get "http://172.16.16.150:30415/v2/projects/gmsproj1/composite-apps/gms-collection-composite-app/v1/deployment-intent-groups/gms-collection-deployment-intent-group/status?status=deployed": dial tcp 172.16.16.150:30415: connect: connection refused
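Since the accepted request is not lost, a status poller can treat a refused connection as transient rather than as a failure. The sketch below assumes a hypothetical get_status callable that performs the status GET and returns a status string. The wanted value is a placeholder supplied by the caller, not a value taken from this test run.

```python
import time


def wait_for_status(get_status, wanted, timeout_s=300.0, interval_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns `wanted` or the timeout expires.

    get_status is a hypothetical callable (an assumption of this sketch);
    it is expected to raise ConnectionError while the orchestrator is down.
    Returns True when the wanted status is seen, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if get_status() == wanted:
                return True
        except ConnectionError:
            # Orchestrator not reachable yet; keep polling instead of failing.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.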
Connection refused message
If the API call is made after the EMCO microservice has gone down, it fails with a connection refused message ("connect: connection refused"). Reissue the API call after the microservice comes back up.
Example:
Post "http://172.16.16.150:30415/v2/projects/gmsproj1/composite-apps/gms-collection-composite-app/v1/deployment-intent-groups/gms-collection-deployment-intent-group/terminate": dial tcp 172.16.16.150:30415: connect: connection refused
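The two error texts seen in these tests map to different failure modes: "connection refused" when the call is made while the service is down, and "connection reset by peer" when the service goes down with the call in flight. A script could tell them apart with a simple substring check. This is an illustrative sketch; the function name and the returned labels are made up for the example.

```python
def classify_failure(message):
    """Map a client error message to the failure mode it indicates.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    if "connection refused" in message:
        # TCP connection was never established: the request did not
        # reach the service.
        return "service-down"
    if "connection reset by peer" in message:
        # Connection was established and then dropped mid-request.
        return "in-flight"
    return "other"
```

In both cases the remedy described in this document is the same: reissue the call once the service is back up.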
Termination API call in flight
If the orchestrator goes down while the termination API call is in flight, the API call fails with an error.
The API call can be reissued once the service comes back up, and application termination should then complete.
Termination API call completes but status API fails
In this case the orchestrator goes down after the termination API call has succeeded but before the status is updated. The status API returns an error, but the termination succeeds once the orchestrator comes back up.
POST --> URL: http://172.16.16.150:30415/v2/projects/gmsproj1/composite-apps/gms-collection-composite-app/v1/deployment-intent-groups/gms-collection-deployment-intent-group/terminate
Response Code: 202
API Response code 202. Waiting...
Because the orchestrator went down, the status GET call fails with a connection refused message:
Get "http://172.16.16.150:30415/v2/projects/gmsproj1/composite-apps/gms-collection-composite-app/v1/deployment-intent-groups/gms-collection-deployment-intent-group/status?status=deployed": dial tcp 172.16.16.150:30415: connect: connection refused
Logical cloud instantiation API call completes but status API fails
In this case the DCM goes down after the instantiation API call has succeeded but before the status is updated. The status API returns an error, but the logical cloud instantiation succeeds once the DCM comes back up.
POST --> URL: http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/instantiate
Response Code: 202
API Response code 202. Waiting...
Because the DCM went down, the status GET call fails with a connection refused message:
Get "http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/status": dial tcp 172.16.16.150:30477: connect: connection refused
Logical cloud termination API call completes but status API fails
In this case the DCM goes down after the termination API call has succeeded but before the status is updated. The status API returns an error, but the logical cloud termination succeeds once the DCM comes back up.
POST --> URL: http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/terminate
Response Code: 202
API Response code 202. Waiting...
Because the DCM went down, the status GET call fails with a connection refused message:
Get "http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/status": dial tcp 172.16.16.150:30477: connect: connection refused
Logical cloud termination API call in flight
If the DCM goes down while the termination API call is in flight, the API call fails with the following error:
Post "http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/terminate": read tcp 172.16.16.1:56344->172.16.16.150:30477: read: connection reset by peer
Apply: projects/gmsproj1/logical-clouds/default/terminate Error: Post "http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/terminate": read tcp 172.16.16.1:56344->172.16.16.150:30477: read: connection reset by peer
Logical Cloud deletion in progress... Please Wait
parse error: Invalid numeric literal at line 1, column 6
Invalid delete status State.
The API call can be reissued once the service comes back up, and the logical cloud termination should then complete.
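The "parse error: Invalid numeric literal" line in the output above is consistent with a script handing the plain-text connection error to a JSON parser. That reading is an assumption, not something this test run confirms. Under that assumption, a status check can guard the parse so a non-JSON body is reported as "no status available" instead of a parse failure. The function name is hypothetical.

```python
import json


def parse_status_body(body):
    """Return the status document as a dict, or None if body is not JSON.

    Hypothetical helper for this sketch: a None result means the caller
    should retry the status query later rather than inspect the state.
    """
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        # Body was an error message or other plain text, not a status document.
        return None
    # A status document is expected to be a JSON object.
    return doc if isinstance(doc, dict) else None
```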
Cluster provider creation with CLM restarting
In this case, cluster provider creation fails because the CLM is unreachable, and the subsequent logical cloud and deployment intent instantiation calls fail as well:
Post "http://172.16.16.150:30461/v2/cluster-providers": dial tcp 172.16.16.150:30461: connect: connection refused
Apply: cluster-providers Error: Post "http://172.16.16.150:30461/v2/cluster-providers": dial tcp 172.16.16.150:30461: connect: connection refused
Logical cloud and deployment intent instantiation then return response code 500:
POST --> URL: http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/instantiate
Response Code: 500
Response: The server encountered an internal error and was unable to complete your request.
POST --> URL: http://172.16.16.150:30477/v2/projects/gmsproj1/logical-clouds/default/instantiate
Response Code: 500
Response: The server encountered an internal error and was unable to complete your request.
Once the CLM comes back up, reissue the cluster provider, cluster, logical cloud, and DIG instantiation calls; the application instantiation should then complete.
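Because each later call depends on the earlier ones, the reissue has to follow the same order: cluster provider, cluster, logical cloud, then DIG instantiation. A minimal sketch of that replay, assuming each step is a hypothetical zero-argument callable that returns the HTTP response code:

```python
def reissue_in_order(steps):
    """Run (name, call) pairs in order and stop at the first failure.

    Each call is a hypothetical callable (an assumption of this sketch)
    returning an HTTP response code. Returns (completed_names, failed_name),
    where failed_name is None if every step succeeded.
    """
    completed = []
    for name, call in steps:
        code = call()
        if code >= 300:
            # A later step cannot succeed if this one failed, so stop here.
            return completed, name
        completed.append(name)
    return completed, None
```

Stopping at the first failure avoids the cascade seen above, where the logical cloud instantiation returned 500 after cluster provider creation had already failed.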