Graceful shutdown
You can configure the graceful shutdown as described in Graceful shutdown.
Coordinators
By default, coordinators have 15 minutes to terminate gracefully.
The coordinator process receives a SIGTERM signal when Kubernetes wants to terminate the Pod.
If the process has still not exited when the graceful shutdown timeout runs out, Kubernetes issues a SIGKILL signal.
When a coordinator is restarted, all currently running queries fail and cannot be recovered after the restart has finished.
As of Trino version 442, this cannot be prevented (e.g. by using multiple coordinators).
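As a sketch, the coordinator timeout can presumably be changed in the same way as the worker timeout. This assumes that `gracefulShutdownTimeout` is also accepted under `spec.coordinators.config`, which this guide does not state explicitly; the value `30m` is only an illustration.

```yaml
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: trino
spec:
  # ...
  coordinators:
    config:
      # Assumption: same field name as in the worker configuration of this guide
      gracefulShutdownTimeout: 30m
    roleGroups:
      default:
        replicas: 1
```

A longer timeout only gives the coordinator more time to exit; running queries still fail when it restarts.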
Workers
By default, workers have 60 minutes to terminate gracefully.
Trino supports gracefully shutting down workers.
This operator always adds a PreStop hook to gracefully shut them down.
No additional configuration is needed; this guide is intended for users who need to tweak this mechanism.
The default graceful shutdown period is 1 hour, but it can be configured as follows:
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: trino
spec:
  # ...
  workers:
    config:
      gracefulShutdownTimeout: 1h
    roleGroups:
      default:
        replicas: 1
Implementation
Once a worker Pod is asked to terminate, the PreStop hook is executed and the following timeline occurs:
1. The worker goes into the SHUTTING_DOWN state.
2. The worker sleeps for 30 seconds to ensure that the coordinator has noticed the shutdown and stops scheduling new tasks on the worker.
3. The worker now waits until all tasks running on it complete. This takes as long as the longest running query takes.
4. The worker sleeps for 30 seconds to ensure that the coordinator has noticed that all tasks are complete.
5. The PreStop hook never returns, but the JVM is shut down by the graceful shutdown mechanism.
6. If the graceful shutdown doesn't complete quickly enough (e.g. a query runs longer than the graceful shutdown period), the Pod gets killed after <graceful shutdown period> + 30s of step 2 + 30s of step 4 + 10s safety overhead, regardless of whether it has shut down gracefully or not. This is achieved by setting terminationGracePeriodSeconds on the worker Pods. Currently running queries on the worker will fail and cannot be recovered.
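To make the arithmetic concrete: with the default graceful shutdown period of 1 hour, the formula gives 3600 s + 30 s + 30 s + 10 s = 3670 s. A worker Pod would then be expected to carry roughly the following setting. This is an illustrative fragment of a Pod spec, not output copied from a real cluster.

```yaml
# Illustrative fragment of a worker Pod (default gracefulShutdownTimeout of 1h)
spec:
  # 3600s graceful shutdown period
  # + 30s sleep in step 2
  # + 30s sleep in step 4
  # + 10s safety overhead
  terminationGracePeriodSeconds: 3670
```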
As of SDP version The TLS certificate lifetime can be configured using
Implications
All queries that take less than the minimum graceful shutdown period of all roleGroups (1 hour by default) are guaranteed not to be disturbed by the regular termination of Pods.
They can still fail when, for example, a Kubernetes node dies or gets rebooted before it is fully drained.
Because of this, the operator automatically restricts the execution time of queries to the minimum graceful shutdown period of all roleGroups using the Trino configuration query.max-execution-time=3600s.
This causes all queries that take longer than 1 hour to fail with the error message Query failed: Query exceeded the maximum execution time limit of 3600.00s.
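The following sketch illustrates how the minimum is meant to work. It assumes that `gracefulShutdownTimeout` can also be set per roleGroup under `config`; the roleGroup names `small` and `large` are made up for the example.

```yaml
spec:
  workers:
    config:
      gracefulShutdownTimeout: 1h       # role-level value
    roleGroups:
      small:
        replicas: 2
        config:
          gracefulShutdownTimeout: 30m  # assumed per-roleGroup override
      large:
        replicas: 4                     # inherits 1h from the role
```

With these values the minimum over the worker roleGroups shown is 30 minutes, so the query execution time limit would be expected to drop to 30 minutes as well.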
If you need to execute queries that take longer than the configured graceful shutdown period, increase the query.max-execution-time property as follows:
spec:
  coordinators:
    configOverrides:
      config.properties:
        query.max-execution-time: 24h
Keep in mind that queries taking longer than the graceful shutdown period are now subject to failure when a Trino worker gets shut down.
This can be avoided by using fault-tolerant execution, which the operator does not yet support natively.
Until native support is added, you have to enable it using configOverrides.
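A minimal sketch of such an override is shown below. `retry-policy` is the Trino property that switches on fault-tolerant execution; applying it through `config.properties` on both roles is an assumption based on the `query.max-execution-time` override shown earlier. Depending on the retry policy and result sizes, Trino additionally needs an exchange manager, which is not covered here.

```yaml
spec:
  coordinators:
    configOverrides:
      config.properties:
        retry-policy: QUERY   # Trino also accepts TASK
  workers:
    configOverrides:
      config.properties:
        retry-policy: QUERY   # keep identical on all nodes
```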
Authorization requirements
When you are not using OPA for authorization, the user admin is not allowed to gracefully shut down workers.
If you need graceful shutdown, you need to either use OPA or make sure admin is allowed to gracefully shut down workers (e.g. by having your own authorizer or patching Trino).
If you use OPA to authorize Trino requests, make sure the user admin is authorized to trigger a graceful shutdown of the workers.
You can achieve this, for example, by adding the following rule, which grants admin permission to do anything, including graceful shutdown.
allow {
    input.context.identity.user == "admin"
}
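One way to ship such a rule is a ConfigMap that the Stackable OPA bundle builder picks up. The sketch below assumes the `opa.stackable.tech/bundle` label convention of the Stackable OPA operator; the ConfigMap name `trino-opa-rules`, the file name, and the package name `trino` are placeholders that must match your OPA setup.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: trino-opa-rules                 # hypothetical name
  labels:
    opa.stackable.tech/bundle: "true"   # assumed label for rule discovery
data:
  trino.rego: |
    package trino   # assumption: must match the package Trino queries

    default allow = false

    # Grants admin every permission, including graceful shutdown
    allow {
      input.context.identity.user == "admin"
    }
```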
If the user admin does not have permission to gracefully shut down a worker, the error message curl: (22) The requested URL returned error: 403 Forbidden is shown in the worker log and the worker shuts down immediately.
We plan to add CustomResources so that you can define your Trino ACLs via Kubernetes objects. In this case the trino-operator will generate the Rego rules for you, including the rules needed for graceful shutdown. Until then, you need to grant the permission yourself.