Wednesday 23 September 2015

PC Power Control with a Raspberry Pi and MAAS

I recently decided to set up a small cluster of computers at home to be managed by Juju and MAAS. The computers are in the attic, which meant that finger-based power management was going to quickly lose its appeal. Many of my friends and colleagues have enviable home computer setups, with power control done elegantly by iLO/LOM/Intel AMT or some such. None of my old tin boasted anything as grand as that. I could have used wake-on-lan, as both MAAS and my old machines support it, but it doesn't provide a reliable way to power machines off (they don't always have shell access) or to check their current power state (they may not be on the network). What I'd have loved to do was build a rebot, but that would be costly and I'd have to design a bespoke rig for each piece of hardware since my old kit is not in neat NUC form. In the end I decided to get a Raspberry Pi Model B to do the job. Unfortunately, I know next to nothing about electronics and do not own a soldering iron, so my solution had to be solder-free.

Research

I found this blog (minus the tampering with the power feed between the power supply and the motherboard) and this blog (but extended to control multiple machines) both very helpful, and they inspired the final design.

The Basic Design

ATX motherboards expose pins that a computer case uses to wire in the reset button, power button and power LED. I removed the connections to the case, wired the power and reset pins to relays controlled by the Pi, and wired the pins controlling the power LED into one of the PiFace's input ports.

The prototype had jumpers cabled directly from the pins to the Pi, but I wanted to be able to unplug the computers from the Pi and have them some distance apart, so I used Ethernet cables to connect the Pi to the computers' pins. Essentially the solution looks like this...

The Raspberry Pi

I used a Raspberry Pi Model B with a PiFace hat. I had the PiFace in a drawer, so I used it for the prototype as it comes with two relays.

Setting up the Relays

The relays on the PiFace worked, but I needed two relays per PC. To get more relays I bought an Andoer 5V Active Low 8 Channel Road Relay Module Control Board. Oh, and by the way, one RJ45 jack cost twice as much as the 8-relay Arduino board.

Connecting the relay board to the Pi was straightforward. I attached the ground pin to the Raspberry Pi ground and the 7 remaining pins to the 7 PiFace output pins. The relay board also came with a jumper, which I put across the JD-VCC and VCC pins so that the relay coils run off the same 5V supply as the logic side.
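
Just to make the idea concrete, here is a rough sketch of how the Pi side can drive things once the wiring is in place. The real logic lives in the pimaaspower branch mentioned in the Software section below; the pin numbers, timings and helper names here are made up for illustration, the exact API depends on your pifacedigitalio version, and the on/off sense may need flipping for an active-low relay board.

import time
import pifacedigitalio

pfd = pifacedigitalio.PiFaceDigital()

POWER_RELAY = 0   # PiFace output wired to the relay that bridges the ATX power pins
STATE_PIN = 0     # PiFace input wired to the power LED pins

def press_power_button(seconds=0.5):
    # Close the relay briefly, just like tapping the front panel button.
    # Hold for ~5 seconds instead to force a machine off.
    pfd.output_pins[POWER_RELAY].turn_on()
    time.sleep(seconds)
    pfd.output_pins[POWER_RELAY].turn_off()

def powered_on():
    # The power LED drives the PiFace input; invert this if your wiring reads the other way round.
    return bool(pfd.input_pins[STATE_PIN].value)

if __name__ == '__main__':
    if not powered_on():
        press_power_button()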

At the PC End

The PC part of the puzzle was simple. I used jumper cables with a female end to connect the motherboard pins to a depressingly expensive RJ45 jack. The connections on the RJ45 jack are numbered; the table below shows how I wired them to the motherboard.

Colour                | Green   | Orange  | Red     | Yellow  | Blue   | Purple
ATX Pin               | Power   | Power   | Reset   | Reset   | LED -  | LED +
Ethernet Cable Number | 1       | 3       | 6       | 2       | 5      | 4


At the Raspberry Pi End

The Ethernet cable from the PC was again terminated with an RJ45 jack, whose connections were wired into the Pi as below:

Colour                | Green   | Orange  | Red     | Yellow  | Blue   | Purple
Raspberry Pi Location | Relay 1 | Relay 1 | Relay 2 | Relay 2 | GPIO 1 | Ground
Ethernet Cable Number | 1       | 3       | 6       | 2       | 5      | 4


Finished Hardware



  • The breadboard in the picture was used for prototyping but was not used in the final design. 
  • The colours of the jumpers may not match those in the tables above because some got damaged and were replaced.
  • I had problems with the second relay so it was not used.

 

Software

Once everything was connected, MAAS needed a way to remotely control the relays and to check the state of the input pins. I began by writing a Python library, then added a REST interface that MAAS could call. Finally I installed gunicorn to run the server.

cd ~
sudo apt-get install -y python-pifacecommon python-pifacedigitalio bzr gunicorn python-yaml
bzr branch lp:~gnuoy/+junk/pimaaspower
sudo cp pimaaspower/pipower /etc/gunicorn.d/
sudo service gunicorn restart
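
To give a flavour of what gunicorn ends up serving, below is a minimal sketch of the kind of REST wrapper involved. This is illustrative only, not the pimaaspower code: it assumes Flask (which is not in the install list above), it hard-codes the node wiring that the real service presumably reads from a YAML config (hence python-yaml), and the per-node URL suffix and payload format are my own invention.

import time
from flask import Flask, jsonify, request
import pifacedigitalio

app = Flask(__name__)
pfd = pifacedigitalio.PiFaceDigital()

# Hypothetical mapping of node name -> wiring; the real service presumably
# loads something like this from a YAML configuration file.
NODES = {
    'node1': {'power_relay': 0, 'reset_relay': 1, 'state_pin': 0},
}

@app.route('/power/api/v1.0/computer/<node>', methods=['GET'])
def get_state(node):
    # Report on/off based on the power LED wired to the PiFace input.
    pin = NODES[node]['state_pin']
    return jsonify({'node': node,
                    'state': 'on' if pfd.input_pins[pin].value else 'off'})

@app.route('/power/api/v1.0/computer/<node>', methods=['PUT'])
def set_state(node):
    wanted = request.get_json()['state']   # 'on' or 'off'
    current = 'on' if pfd.input_pins[NODES[node]['state_pin']].value else 'off'
    if wanted != current:
        relay = NODES[node]['power_relay']
        # Tap the power button to switch on; hold it to force a power off.
        pfd.output_pins[relay].turn_on()
        time.sleep(0.5 if wanted == 'on' else 5)
        pfd.output_pins[relay].turn_off()
    return jsonify({'node': node, 'state': wanted})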

MAAS

MAAS knows about a number of different power management interfaces and it was fairly straightforward to plug a new one in, although because it involves editing files managed by the MAAS package these changes will need reapplying after a package update. I believe that making power types more pluggable in MAAS is in the pipeline.

  • Firstly, add a new template at /etc/maas/templates/power/pipower.template (content here).
  • Add an entry to JSON_POWER_TYPE_PARAMETERS in /usr/lib/python2.7/dist-packages/provisioningserver/power_schema.py
 
    {
        'name': 'pipower',
        'description': 'Pipower',
        'fields': [
            make_json_field('node_name', "Node Name"),
            make_json_field('power_address', "Power Address"),
            make_json_field('state_pin', "Power State Pin Number"),
            make_json_field('reset_relay', "Reset Relay Number"),
            make_json_field('power_relay', "Power Relay Number"),
        ],
    }
  • Tell MAAS that this power type supports querying power state (unlike wake-on-lan): edit /usr/lib/python2.7/dist-packages/provisioningserver/rpc/power.py and add 'pipower' to QUERY_POWER_TYPES.
  • Restart the cluster controller: sudo service maas-clusterd restart
  • Edit the nodes registered in MAAS, setting:


  1. Power Type: 'Pipower'
  2. Node Name: can be anything, but it makes sense to use the same node name that MAAS uses as it makes debugging easier.
  3. Power Address: http://<pipower ip>:8000/power/api/v1.0/computer (see the quick check below)
  4. Power State Pin Number: the number of the PiFace input port wired to the power LED.
  5. Reset Relay Number: the number of the relay controlling the reset switch.
  6. Power Relay Number: the number of the relay controlling the power switch.
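
As a quick check, you can hit the power address directly (for example from the MAAS cluster controller) to make sure gunicorn is answering; the exact response body depends on the pimaaspower code:

curl http://<pipower ip>:8000/power/api/v1.0/computer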

Scaling

There are eight relays available and each managed computer uses two of them, so the system as described supports four machines. However, the reset relay is not strictly needed since MAAS never uses it, which means you could support eight machines.

End Result

I can now have MAAS provision the machines without me being physically present. So, for example, I can use Juju to fire up a fully functioning OpenStack cloud with two commands.

juju bootstrap
juju-deployer -c deploy.yaml trusty-kilo

In the background Juju will request as many physical machines as it needs from MAAS. MAAS will bring those machines online, install Ubuntu and hand them back to Juju. Juju then uses the OpenStack charms to install and configure the OpenStack services, all while I'm downstairs deciding which biscuit to have with my cup of tea. Whenever I'm finished I can tear down the environment.

juju destroy-environment maas

MAAS will power the machines off and they'll be put back in the pool ready for the next deployment.

NOTE: I adapted an existing bundle from the Juju charm store to fit OpenStack on my diminutive cluster. The bundle is here.

Wednesday 8 July 2015

Neutron Router High Availability? As easy as "juju set"

The Juju charms for deploying OpenStack have just had their three-monthly update (the 15.04 release). The charms now support the new Neutron Layer 3 High Availability feature, which uses the Virtual Router Redundancy Protocol (VRRP). When enabled, this feature allows Neutron to quickly fail a router over to another Neutron gateway in the event that the primary node hosting the router is lost. The feature was introduced in Juno but marked as experimental, so I would recommend only using it with deployments >= Kilo.

Enabling Router HA:


L3 HA in Kilo requires that DVR and L2 population are disabled, so to enable it in the charms:
juju set neutron-api enable-l3ha=True
juju set neutron-api enable-dvr=False
juju set neutron-api l2-population=False

The number of L3 agents that will run standby routers can also be configured:
juju set neutron-api max-l3-agents-per-router=2
juju set neutron-api min-l3-agents-per-router=2

Creating an HA enabled router:


Once enable-l3ha has been set, the charms configure Neutron to create new routers as HA routers by default.
$ neutron router-create ha-router
Created a new router:
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| admin_state_up        | True                                 |
| distributed           | False                                |
| external_gateway_info |                                      |
| ha                    | True                                 |
| id                    | 64ff0665-5600-433c-b2d8-33509ce88eb1 |
| name                  | test2                                |
| routes                |                                      |
| status                | ACTIVE                               |
| tenant_id             | 8e8b1426508f42aeaff783180d7b2ef4     |
+-----------------------+--------------------------------------+
/!\ Currently a router cannot be switched in and out of HA mode
$ neutron router-update 64ff0665-5600-433c-b2d8-33509ce88eb1  --ha=False
400-{u'NeutronError': {u'message': u'Cannot update read-only attribute ha', u'type': u'HTTPBadRequest', u'detail': u''}}
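
If you genuinely need a non-HA router while enable-l3ha is set, the HA property has to be chosen at creation time; as an admin you can pass the flag explicitly when creating the router, for example:

$ neutron router-create --ha False ordinary-router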

Under the hood:

Below is a worked example following the creation of an HA enabled router, showing the components created implicitly by Neutron. In this environment the following networks have already been created:
$ neutron net-list
+--------------------------------------+---------+------------------------------------------------------+
| id                                   | name    | subnets                                              |
+--------------------------------------+---------+------------------------------------------------------+
| 32ba54bc-804e-489e-8903-b8dc0ed535f7 | private | a3ed1cc4-3451-418f-a412-80ad8cca2ec4 192.168.21.0/24 |
| c9a3bc24-6390-4220-b136-bc0edf1fe2f2 | ext_net | 76098d4d-bfa4-4f96-89e0-78c851d80dac 10.5.0.0/16     |
+--------------------------------------+---------+------------------------------------------------------+

$ neutron subnet-list 
+--------------------------------------+----------------+-----------------+----------------------------------------------------+
| id                                   | name           | cidr            | allocation_pools                                   |
+--------------------------------------+----------------+-----------------+----------------------------------------------------+
| 76098d4d-bfa4-4f96-89e0-78c851d80dac | ext_net_subnet | 10.5.0.0/16     | {"start": "10.5.150.0", "end": "10.5.200.254"}     |
| a3ed1cc4-3451-418f-a412-80ad8cca2ec4 | private_subnet | 192.168.21.0/24 | {"start": "192.168.21.2", "end": "192.168.21.254"} |
+--------------------------------------+----------------+-----------------+----------------------------------------------------+

In this environment there are three neutron-gateways:
$ juju status neutron-gateway --format=short

- neutron-gateway/0: 10.5.29.216 (started)
- neutron-gateway/1: 10.5.29.217 (started)
- neutron-gateway/2: 10.5.29.218 (started)

With their corresponding L3 agents:
$ neutron agent-list | grep "L3 agent"
| 28f227d8-e620-4478-ba36-856fb0409393 | L3 agent           | juju-lytrusty-machine-7  | :-)   | True           |
| 8d439f33-e4f8-4784-a617-5b3328bab9e3 | L3 agent           | juju-lytrusty-machine-6  | :-)   | True           |
| bdc00c2a-77c0-45c3-ab8a-ceca3319832d | L3 agent           | juju-lytrusty-machine-8  | :-)   | True           |

There is no router defined yet so the only network namespace present is the dhcp namespace for the private network:
$ juju run --service neutron-gateway --format=yaml "ip netns list"
- MachineId: "6"
  Stdout: ""
  UnitId: neutron-gateway/0
- MachineId: "7"
  Stdout: |
    qdhcp-32ba54bc-804e-489e-8903-b8dc0ed535f7
  UnitId: neutron-gateway/1
- MachineId: "8"
  Stdout: ""
  UnitId: neutron-gateway/2

Creating a router will add a qrouter-$ROUTER_ID netns to two of the gateway nodes (since min-l3-agents-per-router=2 and max-l3-agents-per-router=2):

$ neutron router-create ha-router
Created a new router:
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| admin_state_up        | True                                 |
| distributed           | False                                |
| external_gateway_info |                                      |
| ha                    | True                                 |
| id                    | 192ba483-c060-4ee2-86ad-fe38ea280c93 |
| name                  | ha-router                            |
| routes                |                                      |
| status                | ACTIVE                               |
| tenant_id             | 8e8b1426508f42aeaff783180d7b2ef4     |
+-----------------------+--------------------------------------+

Neutron has assigned this router to two of the three agents:

$ ROUTER_ID="192ba483-c060-4ee2-86ad-fe38ea280c93"
$ neutron l3-agent-list-hosting-router $ROUTER_ID
+--------------------------------------+-------------------------+----------------+-------+
| id                                   | host                    | admin_state_up | alive |
+--------------------------------------+-------------------------+----------------+-------+
| 28f227d8-e620-4478-ba36-856fb0409393 | juju-lytrusty-machine-7 | True           | :-)   |
| bdc00c2a-77c0-45c3-ab8a-ceca3319832d | juju-lytrusty-machine-8 | True           | :-)   |
+--------------------------------------+-------------------------+----------------+-------+
A netns for the new router will have been created in neutron-gateway/1 and neutron-gateway/2:
$  juju run --service neutron-gateway --format=yaml "ip netns list"
- MachineId: "6"
  Stdout: ""
  UnitId: neutron-gateway/0
- MachineId: "7"
  Stdout: |
    qrouter-192ba483-c060-4ee2-86ad-fe38ea280c93
    qdhcp-32ba54bc-804e-489e-8903-b8dc0ed535f7
  UnitId: neutron-gateway/1
- MachineId: "8"
  Stdout: |
    qrouter-192ba483-c060-4ee2-86ad-fe38ea280c93
  UnitId: neutron-gateway/2
A keepalived process is spawned in each of the qrouter netns, and these processes communicate over a dedicated network which is created implicitly when the HA enabled router is added.
$ neutron net-list
+--------------------------------------+----------------------------------------------------+-------------------------------------------------------+
| id                                   | name                                               | subnets                                               |
+--------------------------------------+----------------------------------------------------+-------------------------------------------------------+
| 32ba54bc-804e-489e-8903-b8dc0ed535f7 | private                                            | a3ed1cc4-3451-418f-a412-80ad8cca2ec4 192.168.21.0/24  |
| af9cad57-b4fe-465d-b439-b72aaec16309 | HA network tenant 8e8b1426508f42aeaff783180d7b2ef4 | f0cb279b-36fe-43dc-a03b-8eb8b99e7f0b 169.254.192.0/18 |
| c9a3bc24-6390-4220-b136-bc0edf1fe2f2 | ext_net                                            | 76098d4d-bfa4-4f96-89e0-78c851d80dac 10.5.0.0/16      |
+--------------------------------------+----------------------------------------------------+-------------------------------------------------------+

Neutron creates a dedicated interface in the qrouter netns for this traffic.

$ neutron port-list
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------------+
| id                                   | name                                            | mac_address       | fixed_ips                                                                            |
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------------+
| 72326e9b-67e8-403a-80c3-4bac9748cdb6 |                                                 | fa:16:3e:aa:2c:96 | {"subnet_id": "a3ed1cc4-3451-418f-a412-80ad8cca2ec4", "ip_address": "192.168.21.2"}  |
| 89c47030-f849-41ed-96e6-a36a3a696eeb | HA port tenant 8e8b1426508f42aeaff783180d7b2ef4 | fa:16:3e:d4:fc:a1 | {"subnet_id": "f0cb279b-36fe-43dc-a03b-8eb8b99e7f0b", "ip_address": "169.254.192.1"} |
| 9ce2b6ac-9983-4ffd-ae97-6400682021c8 | HA port tenant 8e8b1426508f42aeaff783180d7b2ef4 | fa:16:3e:a5:76:e9 | {"subnet_id": "f0cb279b-36fe-43dc-a03b-8eb8b99e7f0b", "ip_address": "169.254.192.2"} |
+--------------------------------------+-------------------------------------------------+-------------------+--------------------------------------------------------------------------------------+

$  juju run --unit neutron-gateway/1,neutron-gateway/2 --format=yaml "ip netns exec qrouter-$ROUTER_ID ip addr list | grep  ha-"
- MachineId: "7"
  Stdout: |
    2: ha-89c47030-f8:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        inet 169.254.192.1/18 brd 169.254.255.255 scope global ha-89c47030-f8
  UnitId: neutron-gateway/1
- MachineId: "8"
  Stdout: |
    2: ha-9ce2b6ac-99:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        inet 169.254.192.2/18 brd 169.254.255.255 scope global ha-9ce2b6ac-99
        inet 169.254.0.1/24 scope global ha-9ce2b6ac-99
  UnitId: neutron-gateway/2

Keepalived writes out its state to /var/lib/neutron/ha_confs/$ROUTER_ID/state; this can be queried to find out which node is currently the master:

$ juju run --unit  neutron-gateway/1,neutron-gateway/2 "cat /var/lib/neutron/ha_confs/$ROUTER_ID/state"
- MachineId: "7"
  Stdout: backup
  UnitId: neutron-gateway/1
- MachineId: "8"
  Stdout: master
  UnitId: neutron-gateway/2
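
The generated keepalived configuration (the VRRP priorities, the virtual IPs it manages and the dedicated ha- interface it binds to) should live alongside the state file and can be inspected the same way:

$ juju run --unit neutron-gateway/2 "cat /var/lib/neutron/ha_confs/$ROUTER_ID/keepalived.conf"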

Plugging the router into the networks:

$ neutron router-gateway-set $ROUTER_ID c9a3bc24-6390-4220-b136-bc0edf1fe2f2
Set gateway for router 192ba483-c060-4ee2-86ad-fe38ea280c93
$ neutron router-interface-add $ROUTER_ID a3ed1cc4-3451-418f-a412-80ad8cca2ec4
Added interface 4ffe673c-b528-4891-b9ec-3ebdcfc146e2 to router 192ba483-c060-4ee2-86ad-fe38ea280c93.

The router now has an IP on the external network (and a gateway IP on the private subnet), both of which will be managed by keepalived:

$ neutron router-show $ROUTER_ID                            
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                                                                                                  |
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                                                                                                                   |
| distributed           | False                                                                                                                                                                                  |
| external_gateway_info | {"network_id": "c9a3bc24-6390-4220-b136-bc0edf1fe2f2", "enable_snat": true, "external_fixed_ips": [{"subnet_id": "76098d4d-bfa4-4f96-89e0-78c851d80dac", "ip_address": "10.5.150.0"}]} |
| ha                    | True                                                                                                                                                                                   |
| id                    | 192ba483-c060-4ee2-86ad-fe38ea280c93                                                                                                                                                   |
| name                  | ha-router                                                                                                                                                                              |
| routes                |                                                                                                                                                                                        |
| status                | ACTIVE                                                                                                                                                                                 |
| tenant_id             | 8e8b1426508f42aeaff783180d7b2ef4                                                                                                                                                       |
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Since neutron-gateway/2 is currently the master, it will have the router IP (10.5.150.0) in its netns:
$ juju run --unit neutron-gateway/1,neutron-gateway/2 --format=yaml "ip netns exec qrouter-192ba483-c060-4ee2-86ad-fe38ea280c93 ip addr list | grep  10.5.150"
- MachineId: "7"
  ReturnCode: 1
  Stdout: ""
  UnitId: neutron-gateway/1
- MachineId: "8"
  Stdout: |2
        inet 10.5.150.0/16 scope global qg-288da587-97
  UnitId: neutron-gateway/2

$ ping -c2 10.5.150.0
PING 10.5.150.0 (10.5.150.0) 56(84) bytes of data.
64 bytes from 10.5.150.0: icmp_seq=1 ttl=64 time=0.756 ms
64 bytes from 10.5.150.0: icmp_seq=2 ttl=64 time=0.487 ms

--- 10.5.150.0 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.487/0.621/0.756/0.136 ms

Finally, shutting down neutron-gateway/2 will trigger the router IP to fail over to neutron-gateway/1:

$ juju run --unit neutron-gateway/2 "shutdown -h now"
$ juju run --unit  neutron-gateway/1 "cat /var/lib/neutron/ha_confs/$ROUTER_ID/state"
master
$ juju run --unit neutron-gateway/1 --format=yaml "ip netns exec qrouter-192ba483-c060-4ee2-86ad-fe38ea280c93 ip addr list | grep  10.5.150"
- MachineId: "7"
  Stdout: |2
        inet 10.5.150.0/16 scope global qg-288da587-97
  UnitId: neutron-gateway/1

$ ping -c2 10.5.150.0
PING 10.5.150.0 (10.5.150.0) 56(84) bytes of data.
64 bytes from 10.5.150.0: icmp_seq=1 ttl=64 time=0.359 ms
64 bytes from 10.5.150.0: icmp_seq=2 ttl=64 time=0.497 ms

--- 10.5.150.0 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.359/0.428/0.497/0.069 ms