The new deferred service event feature is arriving in the 21.04 OpenStack charm release. This will allow an operator to stop services from being restarted in some of the charms. This means interruptions to the data plane can be tightly controlled.
Managing deferred service events
The deferred service event feature is off by default. To enable it, set the enable-auto-restarts charm config option to False:
$ juju config neutron-gateway enable-auto-restarts=False
Triggering a deferred service restart via a charm change
Changing the neutron-gateway charm's 'debug' option causes neutron.conf to be updated. Normally a change to neutron.conf triggers a restart of the neutron services. However, when auto restarts are disabled the charm updates neutron.conf but does not restart the neutron services. Instead it lets the operator know, via the workload status, that a restart is needed.
$ juju config neutron-gateway debug=True
$ juju status neutron-gateway
Model              Controller              Cloud/Region             Version  SLA          Timestamp
zaza-cfafc581b686  gnuoy-serverstack-nons  serverstack/serverstack  2.8.8    unsupported  10:02:19Z

App              Version  Status  Scale  Charm            Store  Channel  Rev  OS      Message
neutron-gateway  15.3.2   active      1  neutron-gateway  local            65  ubuntu  Unit is ready. Services queued for restart: neutron-dhcp-agent, neutron-l3-agent, neutron-metadata-agent, neutron-metering-agent, neutron-openvswitch-agent

Unit                Workload  Agent  Machine  Public address  Ports  Message
neutron-gateway/0*  active    idle   5        172.20.0.37            Unit is ready. Services queued for restart: neutron-dhcp-agent, neutron-l3-agent, neutron-metadata-agent, neutron-metering-agent, neutron-openvswitch-agent

Machine  State    DNS          Inst id                               Series  AZ    Message
5        started  172.20.0.37  9cc5c808-9c85-4b23-aaca-ded6ba666d33  bionic  nova  ACTIVE
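For automation it can be handy to pull the list of queued services out of the workload status message. The following is a minimal sketch in Python, assuming the message always has the form shown above and that nothing follows the comma-separated list. The function name queued_services is hypothetical and is not part of the charm or of Juju.

```python
# Marker text taken from the status output above.
MARKER = "Services queued for restart:"


def queued_services(message):
    """Return the service names listed after MARKER, or [] if it is absent."""
    _, found, tail = message.partition(MARKER)
    if not found:
        return []
    # Assumption: the list runs to the end of the message.
    return [name.strip() for name in tail.split(",") if name.strip()]


message = (
    "Unit is ready. Services queued for restart: "
    "neutron-dhcp-agent, neutron-l3-agent"
)
print(queued_services(message))
# ['neutron-dhcp-agent', 'neutron-l3-agent']
```

A plain "Unit is ready" message yields an empty list, so the helper can be used to poll until all restarts have been cleared.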
Triggering a deferred hook
There are some occasions when it is not safe for a hook to run at all while the charm is deferring events. For example, suppose the rabbitmq-server charm were to switch from plain text mode to TLS. If the rabbit daemon is not restarted it will continue to run without TLS, so the clients cannot be told to switch to TLS as they would no longer be able to connect. It is also not safe to update the rabbitmq config without restarting the service, because the service may get restarted for an unexpected reason such as a server reboot. If that happens, rabbit will flip to the new config and the clients will be left trying to talk plain text to a TLS-only service. To avoid this the charm may defer running the entire hook. When this happens it is also visible in the workload status message.
$ juju config neutron-openvswitch disable-mlockall=False
$ juju status neutron-openvswitch/0
Model              Controller              Cloud/Region             Version  SLA          Timestamp
zaza-cfafc581b686  gnuoy-serverstack-nons  serverstack/serverstack  2.8.8    unsupported  10:44:12Z

App                  Version  Status  Scale  Charm                Store       Channel  Rev  OS      Message
neutron-openvswitch  15.3.2   active      1  neutron-openvswitch  charmstore           433  ubuntu  Unit is ready. Hooks skipped due to disabled auto restarts: config-changed
nova-compute         20.5.0   active      1  nova-compute         charmstore           539  ubuntu  Unit is ready

Unit                      Workload  Agent  Machine  Public address  Ports  Message
nova-compute/0*           active    idle   7        172.20.0.6             Unit is ready
  neutron-openvswitch/0*  active    idle            172.20.0.6             Unit is ready. Hooks skipped due to disabled auto restarts: config-changed

Machine  State    DNS         Inst id                               Series  AZ    Message
7        started  172.20.0.6  f160add9-ec68-4658-9688-da6dc7cb8c44  bionic  nova  ACTIVE
Triggering a deferred service restart via package change
The charms also ensure that package updates do not trigger restarts of key services. This still applies when the package update happens outside of a charm hook or action. If the update does happen outside of the charm then the next update-status hook will spot that a restart is needed and display that in the workload status message.
$ juju run --unit neutron-gateway/0 "dpkg-reconfigure openvswitch-switch; ./hooks/update-status"
active
active
active
active
active
invoke-rc.d: policy-rc.d denied execution of restart.
$ juju status neutron-gateway
Model              Controller              Cloud/Region             Version  SLA          Timestamp
zaza-cfafc581b686  gnuoy-serverstack-nons  serverstack/serverstack  2.8.8    unsupported  10:26:46Z

App              Version  Status  Scale  Charm            Store  Channel  Rev  OS      Message
neutron-gateway  15.3.2   active      1  neutron-gateway  local            65  ubuntu  Unit is ready. Services queued for restart: openvswitch-switch

Unit                Workload  Agent  Machine  Public address  Ports  Message
neutron-gateway/0*  active    idle   5        172.20.0.37            Unit is ready. Services queued for restart: openvswitch-switch

Machine  State    DNS          Inst id                               Series  AZ    Message
5        started  172.20.0.37  9cc5c808-9c85-4b23-aaca-ded6ba666d33  bionic  nova  ACTIVE
Triggering a deferred service restart via OpenStack upgrade
Perhaps the most interesting scenario is an OpenStack upgrade. In this case the package update is triggered by updating the charm's openstack-origin option. With deferred service events enabled, the long-running upgrade will complete without interrupting access to guests. In the example below the ovs-vswitchd process ID is the same before and after the upgrade, showing that the service was not restarted:
$ juju run --unit neutron-gateway/0 "pgrep ovs-vswitchd; dpkg -l | grep neutron-common"
30718
ii  neutron-common  2:15.3.2-0ubuntu1~cloud2  all  Neutron is a virtual network service for Openstack - common
$ juju config neutron-gateway openstack-origin
cloud:bionic-train
$ juju config neutron-gateway openstack-origin=cloud:bionic-ussuri
$ juju run --unit neutron-gateway/0 "pgrep ovs-vswitchd; dpkg -l | grep neutron-common"
30718
ii  neutron-common  2:16.3.0-0ubuntu3~cloud0  all  Neutron is a virtual network service for Openstack - common
$ juju status neutron-gateway/0
Model              Controller              Cloud/Region             Version  SLA          Timestamp
zaza-cfafc581b686  gnuoy-serverstack-nons  serverstack/serverstack  2.8.8    unsupported  14:13:04Z

App              Version  Status  Scale  Charm            Store  Channel  Rev  OS      Message
neutron-gateway  16.3.0   active      1  neutron-gateway  local            65  ubuntu  Unit is ready. Services queued for restart: neutron-dhcp-agent, neutron-dhcp-agent.service, neutron-l3-agent, neutron-l3-agent.service, neutron-metadata-agent, neutron-metadata-agent.service, neutron-metering-agent, neutron-metering-agent.service, neutron-openvswitch-agent, neutron-openvswitch-agent.service, openvswitch-switch

Unit                Workload  Agent  Machine  Public address  Ports  Message
neutron-gateway/0*  active    idle   5        172.20.0.37            Unit is ready. Services queued for restart: neutron-dhcp-agent, neutron-dhcp-agent.service, neutron-l3-agent, neutron-l3-agent.service, neutron-metadata-agent, neutron-metadata-agent.service, neutron-metering-agent, neutron-metering-agent.service, neutron-openvswitch-agent, neutron-openvswitch-agent.service, openvswitch-switch

Machine  State    DNS          Inst id                               Series  AZ    Message
5        started  172.20.0.37  9cc5c808-9c85-4b23-aaca-ded6ba666d33  bionic  nova  ACTIVE
Running a service restart
The charms provide a restart-services action which accepts a deferred-only option. When the action is run with deferred-only=True the charm will check which services are in need of a restart and restart them. For example, to clear all deferred restarts:
$ juju run-action neutron-gateway/0 restart-services deferred-only=True --wait
unit-neutron-gateway-0:
  UnitId: neutron-gateway/0
  id: "238"
  results:
    Stdout: |
      active
      active
      active
      active
      active
  status: completed
  timing:
    completed: 2021-04-23 10:07:19 +0000 UTC
    enqueued: 2021-04-23 10:06:42 +0000 UTC
    started: 2021-04-23 10:06:45 +0000 UTC
Note: If a service is restarted manually then the charm's workload status message will be updated when the next hook runs.
Running a deferred hook
The charms provide a run-deferred-hooks action which will run any hooks that have been deferred. Any service restarts that are marked as deferred will be restarted as part of running this action.
$ juju run-action neutron-openvswitch/0 run-deferred-hooks --wait
Showing details of deferred events
The charms provide a show-deferred-events action. This will list the events that have been deferred with some extra detail.
$ juju run-action neutron-gateway/0 show-deferred-events --wait
unit-neutron-gateway-0:
  UnitId: neutron-gateway/0
  id: "256"
  results:
    output: |
      hooks: []
      restarts:
      - 1619173568 openvswitch-switch Package update
      - 1619175335 openvswitch-switch Package update
      - '1619181884 neutron-dhcp-agent File(s) changed: /etc/neutron/dhcp_agent.ini, /etc/neutron/neutron.conf'
      - '1619181884 neutron-l3-agent File(s) changed: /etc/neutron/neutron.conf'
      - '1619181884 neutron-metadata-agent File(s) changed: /etc/neutron/neutron.conf'
      - '1619181884 neutron-metering-agent File(s) changed: /etc/neutron/neutron.conf'
      - '1619181884 neutron-openvswitch-agent File(s) changed: /etc/neutron/neutron.conf'
  status: completed
  timing:
    completed: 2021-04-23 12:44:57 +0000 UTC
    enqueued: 2021-04-23 12:44:56 +0000 UTC
    started: 2021-04-23 12:44:56 +0000 UTC
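Each entry under restarts in the output above appears to be a single string made of a timestamp, a service name and a free-text reason. As a rough sketch, assuming that three-part layout holds for every entry, the strings can be grouped by service in Python. The function name group_restarts is hypothetical, and the entries are typed in by hand here rather than read from the action output.

```python
from collections import defaultdict


def group_restarts(entries):
    """Group 'timestamp service reason' strings by service name."""
    grouped = defaultdict(list)
    for entry in entries:
        # Split on whitespace twice only, so the reason keeps its spaces.
        timestamp, service, reason = entry.split(maxsplit=2)
        grouped[service].append((int(timestamp), reason))
    return dict(grouped)


restarts = [
    "1619173568 openvswitch-switch Package update",
    "1619175335 openvswitch-switch Package update",
    "1619181884 neutron-l3-agent File(s) changed: /etc/neutron/neutron.conf",
]
for service, events in group_restarts(restarts).items():
    print(service, len(events))
# openvswitch-switch 2
# neutron-l3-agent 1
```

Grouping makes it easy to see that a service such as openvswitch-switch has more than one outstanding request, all of which a single restart would satisfy.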
Under the hood
Recording deferred events
When a charm or package needs to restart a service but cannot, this is recorded in a file in /var/lib/policy-rc.d. These files have the following format:
# cat /var/lib/policy-rc.d/charm-neutron-gateway-6df8252a-a422-11eb-a3e0-fa163e25ff5d.deferred
{ action: restart, policy_requestor_name: neutron-gateway, policy_requestor_type: charm, reason: Package update, service: openvswitch-switch, timestamp: 1619175335}
This shows that the deferred action was a restart against the openvswitch-switch service. The time the request was made is recorded in seconds since the epoch and can be converted using the date command:
$ date -d @1619175335
Fri 23 Apr 11:55:35 BST 2021
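When processing these files from a script, the same conversion can be done with the Python standard library. This is only a small sketch, and the helper name deferred_time is hypothetical. It reports the time in UTC, which is one hour behind the BST value printed by date above.

```python
from datetime import datetime, timezone


def deferred_time(timestamp):
    """Convert a deferred event timestamp (epoch seconds) to a UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


print(deferred_time(1619175335).isoformat())
# 2021-04-23T10:55:35+00:00
```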
The file also shows that the restart was requested because a package was updated. Finally the policy_requestor_name and policy_requestor_type keys show that the neutron-gateway charm is requesting that restarts of the service are denied.
These files are read by the update-status hook. The charm checks the timestamp in the file against the start time of the service. If the service was restarted after the timestamp in the file, the file is removed and that deferred event is considered to be complete. Otherwise the events are summarised in the workload status message.
This means that deferred restarts can be cleared by restarting the service manually, by removing the deferred event file or by running the restart-services action mentioned earlier.
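The comparison described above can be sketched in Python. This is an illustration of the rule only, not the charm's actual code: the DeferredEvent class, the split_events and status_suffix functions, and the service start times in the example are all hypothetical. Real start times would have to be obtained from the init system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeferredEvent:
    timestamp: int  # epoch seconds at which the restart was requested
    service: str
    reason: str


def split_events(events, service_start_times):
    """Partition events into (cleared, pending).

    An event counts as cleared when its service started after the
    restart was requested. A service with no known start time stays pending.
    """
    cleared, pending = [], []
    for event in events:
        started = service_start_times.get(event.service)
        if started is not None and started > event.timestamp:
            cleared.append(event)
        else:
            pending.append(event)
    return cleared, pending


def status_suffix(pending):
    """Summarise pending events in the style of the workload status message."""
    services = sorted({event.service for event in pending})
    if not services:
        return ""
    return "Services queued for restart: " + ", ".join(services)


events = [
    DeferredEvent(1619175335, "openvswitch-switch", "Package update"),
    DeferredEvent(1619181884, "neutron-l3-agent", "File(s) changed"),
]
# Made-up start times: openvswitch-switch restarted after its request,
# neutron-l3-agent has been running since before its request.
start_times = {"openvswitch-switch": 1619176000, "neutron-l3-agent": 1619100000}

cleared, pending = split_events(events, start_times)
print([event.service for event in cleared])
# ['openvswitch-switch']
print(status_suffix(pending))
# Services queued for restart: neutron-l3-agent
```

In this model a manual restart clears an event simply by moving the service start time past the request timestamp, which matches the behaviour described above.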
Integration with packaging
The charm makes use of the policy-rc.d interface. When a package wishes to interact with a service it runs /usr/sbin/policy-rc.d with the name of the service and the action it wishes to take. The return code of the script tells the packaging system whether the action was permitted or not. The charm ships its own implementation of the policy-rc.d script. This script decides whether an action is permitted by examining policy files in /etc/policy-rc.d. These policy files list which actions against which services are denied.
# cat /etc/policy-rc.d/charm-neutron-gateway.policy
# Managed by juju
blocked_actions:
  neutron-dhcp-agent: [restart, stop, try-restart]
  neutron-l3-agent: [restart, stop, try-restart]
  neutron-metadata-agent: [restart, stop, try-restart]
  neutron-metering-agent: [restart, stop, try-restart]
  neutron-openvswitch-agent: [restart, stop, try-restart]
  openvswitch-switch: [restart, stop, try-restart]
  ovs-vswitchd: [restart, stop, try-restart]
  ovsdb-server: [restart, stop, try-restart]
policy_requestor_name: neutron-gateway
policy_requestor_type: charm
The charm that wrote the policy file is indicated by the policy_requestor_name key and the blocked_actions key lists which actions are blocked for each service.
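To make the decision concrete, here is a minimal Python sketch of how a policy-rc.d style check could work against policies of this shape. It is not the script the charm ships: the function name policy_exit_code is hypothetical, and the policy is written as an already-parsed dictionary rather than loaded from /etc/policy-rc.d. The exit statuses 0 (action allowed) and 101 (action forbidden by policy) are the ones defined by the Debian policy-rc.d interface.

```python
ALLOWED = 0      # policy-rc.d exit status: action allowed
FORBIDDEN = 101  # policy-rc.d exit status: action forbidden by policy


def policy_exit_code(policies, service, action):
    """Return FORBIDDEN if any policy blocks the action for the service."""
    for policy in policies:
        blocked = policy.get("blocked_actions", {})
        if action in blocked.get(service, []):
            return FORBIDDEN
    return ALLOWED


# A cut-down version of the policy file shown above, as parsed data.
neutron_gateway_policy = {
    "blocked_actions": {
        "neutron-l3-agent": ["restart", "stop", "try-restart"],
        "openvswitch-switch": ["restart", "stop", "try-restart"],
    },
    "policy_requestor_name": "neutron-gateway",
    "policy_requestor_type": "charm",
}

print(policy_exit_code([neutron_gateway_policy], "openvswitch-switch", "restart"))
# 101
print(policy_exit_code([neutron_gateway_policy], "openvswitch-switch", "start"))
# 0
```

Taking a list of policies reflects the possibility that more than one charm on a machine writes a policy file. In this sketch an action is denied if any one of them blocks it, which is an assumption about how multiple files would be combined.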