~tore Tore Anderson's technology blog

IPv6 support in the PlayStation 4

The other day, I noticed with great interest that my PlayStation 4 was using IPv6 to communicate with the Internet. I’m fairly certain that this behaviour is new, so I decided to investigate.

This is what appeared on the wire when it connected to the network:

  1 0.000000000           :: -> ff02::16     ICMPv6 110 Multicast Listener Report Message v2
  2 0.072956000           :: -> ff02::1:ffe2:19c7 ICMPv6 78 Neighbor Solicitation for fe80::2d9:d1ff:fee2:19c7
  3 0.799982000           :: -> ff02::16     ICMPv6 90 Multicast Listener Report Message v2
  4 1.600965000 fe80::2d9:d1ff:fee2:19c7 -> ff02::16     ICMPv6 90 Multicast Listener Report Message v2
  5 2.957012000 fe80::2d9:d1ff:fee2:19c7 -> ff02::2      ICMPv6 70 Router Solicitation from 00:d9:d1:e2:19:c7
  6 2.970763000 fe80::385a:20ff:fe70:f441 -> fe80::2d9:d1ff:fee2:19c7 ICMPv6 270 Router Advertisement from 3a:5a:20:70:f4:41
  7 2.971328000 fe80::2d9:d1ff:fee2:19c7 -> ff02::1:2    DHCPv6 110 Solicit XID: 0xe0e8c5 CID: 0003000100d9d1e219c7
  8 2.973796000 fe80::385a:20ff:fe70:f441 -> fe80::2d9:d1ff:fee2:19c7 DHCPv6 191 Advertise XID: 0xe0e8c5 CID: 0003000100d9d1e219c7 IAA: 2a02:fe0:c071:f00a::f1e
  9 2.974148000 fe80::2d9:d1ff:fee2:19c7 -> ff02::1:2    DHCPv6 152 Request XID: 0xe0e8c5 IAA: 2a02:fe0:c071:f00a::f1e CID: 0003000100d9d1e219c7
 10 2.977070000 fe80::385a:20ff:fe70:f441 -> fe80::2d9:d1ff:fee2:19c7 DHCPv6 223 Reply XID: 0xe0e8c5 CID: 0003000100d9d1e219c7 IAA: 2a02:fe0:c071:f00a::f1e
 11 2.977472000           :: -> ff02::1:ff00:f1e ICMPv6 78 Neighbor Solicitation for 2a02:fe0:c071:f00a::f1e
 12 3.000971000 fe80::2d9:d1ff:fee2:19c7 -> ff02::16     ICMPv6 90 Multicast Listener Report Message v2
 13 3.400970000 fe80::2d9:d1ff:fee2:19c7 -> ff02::16     ICMPv6 90 Multicast Listener Report Message v2
 14 3.977343000 fe80::2d9:d1ff:fee2:19c7 -> ff02::1:ff70:f441 ICMPv6 86 Neighbor Solicitation for fe80::385a:20ff:fe70:f441 from 00:d9:d1:e2:19:c7
 15 3.977615000 fe80::385a:20ff:fe70:f441 -> fe80::2d9:d1ff:fee2:19c7 ICMPv6 86 Neighbor Advertisement fe80::385a:20ff:fe70:f441 (rtr, sol, ovr) is at 3a:5a:20:70:f4:41
 16 3.977874000 2a02:fe0:c071:f00a::f1e -> 2a02:fe0:1:2:1:0:1:110 DNS 103 Standard query 0xc4e3  AAAA ena.net.playstation.net
 17 3.987868000 2a02:fe0:1:2:1:0:1:110 -> 2a02:fe0:c071:f00a::f1e DNS 241 Standard query response 0xc4e3  CNAME ena.net.playstation.net.edgekey.net CNAME e4963.dscg.akamaiedge.net AAAA 2a02:26f0:ac:181::1363 AAAA 2a02:26f0:ac:197::1363
 18 3.988383000 2a02:fe0:c071:f00a::f1e -> 2a02:26f0:ac:181::1363 TCP 94 62420→80 [SYN] Seq=0 Win=65535 Len=0 MSS=1440 WS=64 SACK_PERM=1 TSval=415148157 TSecr=0
 19 4.005888000 2a02:26f0:ac:181::1363 -> 2a02:fe0:c071:f00a::f1e TCP 94 80→62420 [SYN, ACK] Seq=0 Ack=1 Win=28560 Len=0 MSS=1440 SACK_PERM=1 TSval=3194590031 TSecr=415148157 WS=32
 20 4.006231000 2a02:fe0:c071:f00a::f1e -> 2a02:26f0:ac:181::1363 TCP 86 62420→80 [ACK] Seq=1 Ack=1 Win=65664 Len=0 TSval=415148175 TSecr=3194590031
 21 4.006361000 2a02:fe0:c071:f00a::f1e -> 2a02:26f0:ac:181::1363 HTTP 166 GET /netstart/ps4 HTTP/1.1
 22 4.021963000 2a02:26f0:ac:181::1363 -> 2a02:fe0:c071:f00a::f1e TCP 86 80→62420 [ACK] Seq=1 Ack=81 Win=28576 Len=0 TSval=3194590047 TSecr=415148175
 23 4.022418000 2a02:26f0:ac:181::1363 -> 2a02:fe0:c071:f00a::f1e HTTP 587 HTTP/1.1 403 Forbidden  (text/html)
 24 4.022479000 2a02:26f0:ac:181::1363 -> 2a02:fe0:c071:f00a::f1e TCP 86 80→62420 [FIN, ACK] Seq=502 Ack=81 Win=28576 Len=0 TSval=3194590048 TSecr=415148175
 25 4.022780000 2a02:fe0:c071:f00a::f1e -> 2a02:26f0:ac:181::1363 TCP 86 62420→80 [ACK] Seq=81 Ack=503 Win=65152 Len=0 TSval=415148191 TSecr=3194590048
 26 4.022849000 2a02:fe0:c071:f00a::f1e -> 2a02:26f0:ac:181::1363 TCP 86 62420→80 [FIN, ACK] Seq=81 Ack=503 Win=65664 Len=0 TSval=415148191 TSecr=3194590048
 27 4.037492000 2a02:26f0:ac:181::1363 -> 2a02:fe0:c071:f00a::f1e TCP 86 80→62420 [ACK] Seq=503 Ack=82 Win=28576 Len=0 TSval=3194590063 TSecr=415148191
 28 4.045960000 2a02:26f0:ac:181::1363 -> 2a02:fe0:c071:f00a::f1e TCP 86 [TCP Dup ACK 27#1] 80→62420 [ACK] Seq=503 Ack=82 Win=28576 Len=0 TSval=3194590071 TSecr=415148191
 29 4.046281000 2a02:fe0:c071:f00a::f1e -> 2a02:26f0:ac:181::1363 TCP 74 62420→80 [RST] Seq=82 Win=0 Len=0

There are several things I find noteworthy here:

  1. It supports DHCPv6. Since the DHCPv6 client runs in user space, this strongly indicates that it’s a deliberate move by Sony.
  2. It performs DNS requests over IPv6. A stub resolver also runs in user space, so it’s another indication that this is not accidental.
  3. It uses IPv6 to call home to the dual-stacked URL http://ena.net.playstation.net/netstart/ps4.
  4. The call home URL returns a 403 Forbidden error. However, it does so when accessed using IPv4 as well, so this might not mean much.
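As a side note on the trace, the PS4's MAC address (00:d9:d1:e2:19:c7) can be recognised in three places: the link-local address, the solicited-node multicast groups used for Duplicate Address Detection, and the DHCPv6 client identifier. The following is a minimal Python sketch of those derivations. The function names are mine, and the DUID decoder assumes the client identifier is a DUID-LL (type 3) for Ethernet, which is what the CID value in the trace looks like.

```python
import ipaddress


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Build the fe80::/64 address with a modified EUI-64 interface identifier."""
    b = bytearray(mac_to_bytes(mac))
    b[0] ^= 0x02  # flip the universal/local bit
    iid = bytes(b[:3]) + b"\xff\xfe" + bytes(b[3:])
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + iid)


def solicited_node(addr: ipaddress.IPv6Address) -> ipaddress.IPv6Address:
    """Map a unicast address to its solicited-node multicast group."""
    low24 = int(addr) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def duid_ll_mac(duid_hex: str) -> str:
    """Extract the MAC address from a DUID-LL (type 3, hardware type 1)."""
    raw = bytes.fromhex(duid_hex)
    duid_type = int.from_bytes(raw[0:2], "big")
    hw_type = int.from_bytes(raw[2:4], "big")
    if duid_type != 3 or hw_type != 1:
        raise ValueError("not an Ethernet DUID-LL")
    return ":".join(f"{octet:02x}" for octet in raw[4:])


if __name__ == "__main__":
    link_local = eui64_link_local("00:d9:d1:e2:19:c7")
    print(link_local)                    # compare with frames 2 and 4
    print(solicited_node(link_local))    # compare with frame 2
    print(solicited_node(ipaddress.IPv6Address("2a02:fe0:c071:f00a::f1e")))  # frame 11
    print(duid_ll_mac("0003000100d9d1e219c7"))  # CID from frames 7-10
```

Feeding it the values from the trace reproduces the addresses shown in frames 2, 4 and 11, and recovers the MAC address from the CID.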

For the record, the call home request does not include any personal information beyond the source IP address and a URL indicating it’s a PS4. That said, the request itself is more than enough for Sony to generate useful statistics on how many PS4s with IPv6 Internet access there are out there. The following is the complete call home request made:

GET /netstart/ps4 HTTP/1.1
Connection: close
Host: ena.net.playstation.net
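
A quick way to check that nothing else hides in the request is to compare its size with the trace. The Python sketch below assumes standard CRLF line endings, the three lines shown above followed by an empty line, and the usual header sizes (14 bytes of Ethernet, 40 bytes of IPv6, and a 20-byte TCP header plus 12 bytes of timestamp options). The constant names are mine.

```python
# The call home request, assuming CRLF line endings and a terminating empty line.
REQUEST = (
    b"GET /netstart/ps4 HTTP/1.1\r\n"
    b"Connection: close\r\n"
    b"Host: ena.net.playstation.net\r\n"
    b"\r\n"
)

ETHERNET_HEADER = 14
IPV6_HEADER = 40
TCP_HEADER_WITH_TIMESTAMPS = 20 + 12  # base header + NOP, NOP, timestamp option

payload_len = len(REQUEST)
frame_len = ETHERNET_HEADER + IPV6_HEADER + TCP_HEADER_WITH_TIMESTAMPS + payload_len

if __name__ == "__main__":
    print("HTTP payload bytes:", payload_len)
    print("Expected frame size:", frame_len)
```

Under those assumptions the payload comes to 80 bytes and the frame to 166 bytes, which agrees with frame 21 (length 166) and with the server acknowledging up to sequence number 81 in frame 22.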

So far I’ve not seen it use IPv6 for anything other than what I’ve described above. An application like Netflix, which ought to use IPv6 whenever possible, does not. It would appear, therefore, that this is just a small beginning, perhaps done primarily to gather statistics. Nevertheless, I am very excited to see that Sony has begun work on implementing IPv6 support for the PS4.

Technical details

I first noticed the IPv6 capability after upgrading to system software version 3.50. I can’t rule out that it showed up in an earlier update, though, since I haven’t actively looked for it after installing earlier updates.

I tested a variety of network environments to figure out what exactly the PS4 supports. It would appear that Sony has done a thorough job:

  • It supports assignment of global IPv6 addresses using both SLAAC and DHCPv6 IA_NA. When using SLAAC, the Interface Identifier appears to be randomly generated. That is, the IID does not embed the PS4’s MAC address, and it changes every time the PS4 reconnects to the network.
  • It will learn IPv6 DNS servers from both the Recursive DNS Server RA Option and DHCPv6.
  • Addresses and/or DNS servers learned from DHCPv6 are preferred over those learned from ICMPv6 Router Advertisements (if any).
  • It will start a DHCPv6 client only if either the Managed or OtherConfig RA flag is set. If Managed=1, it will solicit both IA_NA and DNS configuration; otherwise, if OtherConfig=1, it will send a DHCPv6 Information-request message to obtain DNS configuration only.
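
The decision logic in the list above can be summarised in a few lines of Python. This is only a model of the behaviour I observed from the outside, not Sony's implementation; the names `Dhcpv6Mode`, `dhcpv6_mode` and `effective_dns` are made up for the illustration.

```python
from enum import Enum
from typing import List


class Dhcpv6Mode(Enum):
    NONE = "no DHCPv6 client started"
    STATEFUL = "Solicit for IA_NA and DNS configuration"
    STATELESS = "Information-request for DNS configuration only"


def dhcpv6_mode(managed: bool, other_config: bool) -> Dhcpv6Mode:
    """Pick the DHCPv6 behaviour from the M and O flags of the RA."""
    if managed:
        return Dhcpv6Mode.STATEFUL
    if other_config:
        return Dhcpv6Mode.STATELESS
    return Dhcpv6Mode.NONE


def effective_dns(dhcpv6_dns: List[str], rdnss: List[str]) -> List[str]:
    """DNS servers learned from DHCPv6 win over those from the RDNSS RA option."""
    return dhcpv6_dns if dhcpv6_dns else rdnss


if __name__ == "__main__":
    for m, o in [(False, False), (False, True), (True, False), (True, True)]:
        print(f"Managed={int(m)} OtherConfig={int(o)} -> {dhcpv6_mode(m, o).value}")
```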

I did find a couple of bugs too:

  • It would sometimes attempt to use its link-local address to communicate with the DNS server or the HTTP call-home web server, which doesn’t work. This suggests that there is a bug in the PS4’s default address selection logic, or that it failed to activate its SLAAC- or DHCPv6-assigned address. Simply re-connecting to the network would usually resolve this issue.
  • If address assignment is SLAAC-only, and the advertised prefix is off-link, no IPv6 Internet traffic is seen. In this case, the PS4 does not even start the DHCPv6 client even though OtherConfig=1. This is clearly a bug; there’s no reason why SLAAC can’t work perfectly well with off-link prefixes.

The next time I get a system software update, I’ll make sure to re-do all these tests and report any changes in a new post.

IPv6-only data centre RFCs published

I’m very pleased to report that my SIIT-DC RFCs were published by the IETF last week. If you’re interested in learning how to operate an IPv6-only data centre while ensuring that IPv4-only Internet users will remain able to access the services hosted in it, you should really check them out.

Start out with Stateless IP/ICMP Translation for IPv6 Data Center Environments (RFC 7755). This document describes the core functionality of SIIT-DC and the reasons why it was conceived.

If you think that you can’t possibly make your data centre IPv6-only yet because you still need to support a few legacy IPv4-only applications or devices, continue with RFC 7756. This document describes how the basic SIIT-DC architecture can be extended to support IPv4-only applications and devices, allowing them to live happily in an otherwise IPv6-only network.

The third and final document is Explicit Address Mappings for Stateless IP/ICMP Translation (RFC 7757). This extends the previously existing SIIT protocol, making it flexible enough to support SIIT-DC. This extension is not specific to SIIT-DC; other IPv6 transition technologies such as 464XLAT and IVI also make use of it. Unless you’re implementing an IPv4/IPv6 translation device, you can safely skip RFC 7757. That said, if you want a deeper understanding of how SIIT-DC works, I recommend you take the time to read RFC 7757 too.

So what is SIIT-DC, exactly?

SIIT-DC is a novel approach to the IPv6 transition that we’ve developed here at Redpill Linpro. It facilitates the use of IPv6-only data centre environments in the transition period where a significant portion of the Internet remains IPv4-only. One could quite accurately say that SIIT-DC delivers «IPv4-as-a-Service» for data centre operators.

In a nutshell, SIIT-DC works like this: when an IPv4 packet is sent to a service hosted in a data centre (such as a web site), that packet is intercepted by a device called an SIIT-DC Border Relay (BR) as soon as it reaches the data centre. The BR translates the IPv4 packet to IPv6, after which it is forwarded to the IPv6 web server just like any other IPv6 packet. The server’s reply gets routed back to a BR, where it is translated from IPv6 to IPv4, and forwarded through the IPv4 Internet back to the client. Neither the client nor the server needs to know that translation between IPv4 and IPv6 is taking place; the IPv4 client thinks it’s talking to a regular IPv4 server, while the IPv6 server thinks it’s talking to a regular IPv6 client.
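
To make the address handling a bit more concrete, here is a simplified Python sketch of what a BR does to the source and destination addresses. It is not taken from any real translator: the translation prefix, the mapping table and the function names are hypothetical, all addresses are documentation examples, and the table maps single addresses only, whereas RFC 7757 defines mappings between prefixes. Header and ICMP translation is left out entirely.

```python
import ipaddress

# Hypothetical /96 translation prefix used to represent IPv4 clients in IPv6.
TRANSLATION_PREFIX = ipaddress.IPv6Network("2001:db8:46::/96")

# Hypothetical explicit address mappings: public IPv4 service address -> IPv6 server.
EAMT = {
    ipaddress.IPv4Address("192.0.2.1"): ipaddress.IPv6Address("2001:db8:12:34::1"),
}
EAMT_REVERSE = {v6: v4 for v4, v6 in EAMT.items()}


def to_ipv6(addr: ipaddress.IPv4Address) -> ipaddress.IPv6Address:
    """Use an explicit mapping if one exists, else embed in the translation prefix."""
    if addr in EAMT:
        return EAMT[addr]
    return ipaddress.IPv6Address(int(TRANSLATION_PREFIX.network_address) | int(addr))


def to_ipv4(addr: ipaddress.IPv6Address) -> ipaddress.IPv4Address:
    """Reverse of to_ipv6(); purely a function of the address, no per-flow state."""
    if addr in EAMT_REVERSE:
        return EAMT_REVERSE[addr]
    if addr not in TRANSLATION_PREFIX:
        raise ValueError(f"{addr} cannot be represented in IPv4")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


if __name__ == "__main__":
    client = ipaddress.IPv4Address("203.0.113.50")
    service = ipaddress.IPv4Address("192.0.2.1")
    # Request: IPv4 Internet -> BR -> IPv6 data centre
    print(to_ipv6(client), "->", to_ipv6(service))
    # Reply: IPv6 server -> BR -> IPv4 Internet
    print(to_ipv4(to_ipv6(service)), "->", to_ipv4(to_ipv6(client)))
```

Because both directions are pure functions of the addresses, any BR can translate any packet, and the client's IPv4 address remains recoverable from the IPv6 source address the server sees.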

There are several reasons why an operator might find SIIT-DC an appealing approach. In no particular order:

  • It facilitates IPv6 deployment without accumulation of IPv4 technical debt. The operator can simply switch from IPv4 to IPv6, rather than committing to operate IPv6 in parallel with IPv4 for the unforeseeable future (i.e., dual stack). This greatly reduces complexity and operational overhead.
  • It doesn’t require the native IPv6 infrastructure to be built in a certain way. Any IPv6 network is compatible with SIIT-DC. It does not touch native IPv6 traffic from IPv6-enabled users. This means that when the IPv4 protocol eventually falls into disuse, no migration project will be necessary - SIIT-DC can be safely removed without any impact to the IPv6 infrastructure.
  • It maximises the utilisation of the operator’s public IPv4 addresses. If all the operator has available is a /24, every single one of those 256 addresses can be used to provide Internet-facing services and applications. No addresses go to waste due to them being assigned to routers or backend servers (which do not need to communicate with the public Internet). It is no longer necessary to waste addresses by rounding up IPv4 LAN prefix sizes to the nearest power of two. Never again will it be necessary to expand a server LAN prefix, as it will be IPv6-only and thus practically infinitely large.
  • Unlike IPv4 NAT, it is completely stateless. Therefore, it scales in the same way as a standard IP router: the only metrics that matter are packets-per-second and bits-per-second. Its stateless nature makes it trivial to deploy; the BRs can be located anywhere in the IPv6 network. It is possible to spread the load between multiple BRs using standard techniques such as anycast or ECMP. High availability and redundancy are easily accomplished with the use of standard IP routing protocols.
  • Unlike some kinds of IPv4 NAT, it doesn’t hide the source address of IPv4 users. Thus, the IPv6-only application servers remain able to perform tasks which depend on the client’s source address, such as geo-location or abuse logging.
  • It allows for IPv4-only applications or devices to be hosted in an otherwise IPv6-only data centre. This is accomplished through an optional component called an SIIT-DC Edge Relay. This is what is being described in RFC 7756.
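
The point about IPv4 address utilisation is easy to illustrate with some arithmetic. The Python sketch below uses made-up numbers: three server LANs with 5, 20 and 60 servers, three router addresses per LAN (two routers plus a shared virtual address), and one public service per server. None of these figures come from our network; they only show the effect of rounding up to a power of two.

```python
from typing import List


def lan_prefix_size(servers: int, router_addresses: int = 3) -> int:
    """Smallest power-of-two IPv4 subnet that fits the servers, the routers,
    and the network and broadcast addresses."""
    needed = servers + router_addresses + 2
    return 1 << (needed - 1).bit_length()


def traditional_cost(lans: List[int]) -> int:
    """IPv4 addresses consumed when every server LAN gets its own IPv4 subnet."""
    return sum(lan_prefix_size(n) for n in lans)


def siit_dc_cost(lans: List[int]) -> int:
    """IPv4 addresses consumed when only the public services are mapped."""
    return sum(lans)


if __name__ == "__main__":
    lans = [5, 20, 60]  # hypothetical server counts per LAN
    print("Subnet sizes:", [lan_prefix_size(n) for n in lans])
    print("Traditional IPv4 LANs:", traditional_cost(lans), "addresses")
    print("SIIT-DC:", siit_dc_cost(lans), "addresses")
```

With these example numbers, conventional subnetting uses up 176 of the 256 addresses in a /24 to host 85 servers, while with SIIT-DC the same 85 services consume exactly 85 addresses.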

The history of SIIT-DC

I think it was around the year 2008 that it dawned on me that Redpill Linpro’s IPv4 resources would not last forever. At some point in the future we would inevitably be prevented from expanding our infrastructure based on IPv4. It was clear that we needed to come up with a plan on how to deal with that situation well ahead of time. IPv6 obviously needed to be part of that plan, but exactly how wasn’t clear at all.

Conventional wisdom at the time told us that dual stack, i.e., running IPv4 in parallel with IPv6, was the solution. We did some pilot projects, but the results were discouraging. In particular, these problems quickly became apparent:

  1. It would not prevent us from running out of IPv4. After all, dual stack requires just as many IPv4 addresses as single-stack IPv4.
  2. IPv4 would continue to become an ever more entrenched part of our infrastructure. Every new IPv4-using service or application would inevitably make a future IPv4 sunsetting project even more difficult to pull off.
  3. Server and application operators simply didn’t like running two networking protocols in parallel. Dual stack greatly increased complexity: it became necessary to duplicate service configuration, firewall rules, monitoring targets, and so on, just in order to support both protocols equally well. This duplication in turn created lots of new possibilities of things going wrong, reducing reliability and uptime. And when something did go wrong, troubleshooting the issue required more time. Single stack was therefore seen as superior to dual stack.

It was clear that we needed a better approach based on single-stack IPv6, but we were unable to find an already existing one which solved all of our problems.

One of the things that we evaluated, though, was Stateless IP/ICMP Translation (RFC 6145). SIIT looked promising, but it had some significant shortcomings (which RFC 7757’s Problem Statement section elaborates on). In its then-current state, SIIT simply wasn’t flexible enough to be up to the task we had in mind for it. However, we did identify a way SIIT could be improved in order to facilitate our IPv6-only data centre use case. This improvement is what RFC 7757 ended up describing.

I believe the first time I presented the idea of SIIT-DC (under the working name «RFC 6145 ++») in public was at IIS.se’s World IPv6 Day seminar back in June 2011. In case you’re interested in a little bit of «history in the making», the slides (starting at page 34) and video (starting at 34:15) from that event are still available.

A few months later we had a working proof of concept (based on TAYGA) running. By January 2012 I had enough confidence in it to move our corporate home page www.redpill-linpro.com to it, where it has remained since. I didn’t ask for permission…but fortunately I didn’t have to ask for forgiveness either - to this day there have been zero complaints!

The solution turned out to work remarkably well, so in keeping with our open source philosophy we decided to document exactly how it worked so that the entire Internet community could benefit from it. To that end, my very first Internet-Draft, draft-anderson-siit-dc-00, was submitted to the IETF in November 2012. I must admit I greatly underestimated the amount of work that would be necessary from that point on…

The document was eventually adopted by the IPv6 Operations working group (v6ops) and split into three different documents, each covering relatively independent areas of functionality. Then began multiple cycles of peer review and feedback by the working group followed by updates and refinements. I’d especially like to thank Fred Baker, chair of the v6ops working group, for helping out a lot during the process. For a newcomer like me, the IETF procedures can certainly appear rather daunting, but thanks to Fred’s guidance it went very smoothly.

One particularly significant event happened in early 2015, when Alberto Leiva Popper from NIC México joined in the effort as a co-author of RFC 7757-to-be (which describes the specifics of the updated SIIT algorithm). Alberto is the lead developer of Jool, an open-source IPv4/IPv6 translator for the Linux kernel. Thanks to his efforts, RFC 7757-to-be (and, by extension, SIIT-DC) was quickly implemented in Jool, which really helped move things along. The IETF considers the availability of running code to be of utmost importance when considering a proposed new Internet standard, and Jool fit the bill perfectly.

For the record, we decommissioned our old TAYGA-based SIIT-DC BRs in favour of new ones based on Jool as soon as we could. This was a great success - our Jool BRs are currently handling IPv4 connectivity for hundreds of IPv6-only services and applications, and the number is rapidly growing. We’re very grateful to Alberto and NIC México for all the great work they’ve done with Jool - it’s an absolutely fantastic piece of software. I encourage anyone interested in IPv6 transition to download it and try it out.

In late 2015 the documents reached IETF consensus, after which they were sent to the RFC Editor. They did a great job with helping improve the language, fixing inconsistencies, pointing out unclear or ambiguous sentences, and so on. When that was done, the only remaining thing was to publish the documents - which, as I mentioned before, happened last week.

It feels great to have crossed the finish line with these documents, and writing them has certainly been a very interesting exercise. It is also nice to prove that it is possible for regular operators to provide meaningful contributions to the IETF - you don’t have to be an academic or work for one of the big network equipment vendors. That said, it has taken considerable effort, so I certainly look forward to being able to focus fully on my work as a network engineer again. I promise that’s going to result in more good IPv6 news in 2016…watch this space!

Norwegian IPv6 year in review

2016 is fast approaching. In this post I’ll take a look in the rear-view mirror to see how well we did in Norway with regard to IPv6 deployment in 2015. I focus on the status on the end-user side of things, that is, the extent of IPv6 deployment amongst Norwegian ISPs. This is due to the fact that my employer Redpill Linpro mainly provides managed services to content providers, so the traffic entering our dual-stacked data centres can only tell a story about how the ISPs are doing.

End of 2015 status: 7-8% IPv6 adoption

Our customer VG is kind enough to let me use their web site traffic to publish graphs detailing IPv6 deployment in Norway. VG is the largest Norwegian web site, appealing to a broad audience. Therefore the collected data gives a very good basis for making accurate statistics about the Norwegian population in general. The graph below visualises this data, showing how the Norwegian IPv6 adoption rate has developed throughout 2015:

This shows that at the time of writing, 7.3% of all the traffic that reached VG in the previous week was IPv6. While this is an increase compared to the beginning of the year, it is a disappointingly small one - only about a single percentage point.
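
For readers who want to produce a similar figure from their own logs, the following Python sketch shows one way to compute an IPv6 share. It is not the tooling behind the VG graphs: it assumes that one entry per request (rather than bytes transferred) is a good enough proxy for "traffic", and that the client addresses have already been extracted from the logs. The function name and the sample addresses are made up.

```python
import ipaddress
from typing import Iterable


def ipv6_share(client_addresses: Iterable[str]) -> float:
    """Return the percentage of entries whose client address is IPv6."""
    total = 0
    v6 = 0
    for text in client_addresses:
        addr = ipaddress.ip_address(text)
        total += 1
        if addr.version == 6:
            v6 += 1
    return 100.0 * v6 / total if total else 0.0


if __name__ == "__main__":
    # Documentation addresses standing in for real log entries.
    sample = ["192.0.2.10", "198.51.100.7", "2001:db8::1", "203.0.113.99"]
    print(f"{ipv6_share(sample):.1f}% IPv6")
```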

The graph usually peaks above 8.5% every weekend and drops below 7% during weekdays. This tells us that Norwegians are much more likely to have IPv6 at home than at work.

It is worth noting that other large content providers are also measuring IPv6 usage in Norway, and their measurements appear to confirm that my numbers are in the right ballpark: Akamai currently reports 7% adoption, while Google reports 7.94%.

On a global scale, Norway is actually quite average. The last few months in Google’s global IPv6 adoption graph are eerily similar to the VG data for the same period. Ranking the countries by their Google-reported IPv6 adoption percentage shows we’re #15 in the world:

  1. Belgium - 39.43%
  2. Switzerland - 26.22%
  3. United States - 22.95%
  4. Portugal - 21.57%
  5. Germany - 20.49%
  6. Greece - 19.05%
  7. Peru - 15.77%
  8. Luxembourg - 15.59%
  9. Czech Republic - 9.84%
  10. Ecuador - 9.64%
  11. Estonia - 9.60%
  12. St. Kitts & Nevis - 9.16%
  13. Malaysia - 8.88%
  14. Japan - 8.77%
  15. Norway - 7.94%

I’d say this is nothing to celebrate, except perhaps that we fare better than all of our Nordic neighbours (but probably not for long, as Finland is #16 and is climbing fast).

How did the Norwegian ISPs fare in 2015?

Norway is essentially an IPv6 duopoly. Over 80% of all IPv6 traffic originates from two ISPs: the incumbent telco Telenor and the cable ISP Get.

In the next two spots we find the NREN UNINETT and the fibre ISP Altibox. These two are responsible for a tiny (but measurable) share of IPv6 traffic each.

The four networks I’ve mentioned account for over 90% of all the IPv6 traffic. The remaining 10% is the long tail, consisting of way too many networks to mention individually here, as they are responsible for only a minuscule amount of IPv6 traffic each.

Below I examine more closely how Telenor and Get fared in 2015. Note that the Y axis of the following graphs shows a percentage of all traffic, i.e., including IPv4 traffic and IPv6 traffic from other ISPs. It does not say anything about how many of Telenor’s or Get’s subscribers are dual-stacked. Unfortunately, I don’t have such statistics at the moment. Akamai does, however.

Zooming in on Telenor

Telenor uses two distinct IPv6 prefixes, allowing me to make two graphs: One for their mobile subscribers, and another for all their wired broadband subscribers (i.e., cable, DSL, and fibre).

The above graph is for Telenor’s wired broadband customers, which is the largest group overall. It is disappointing to see that this group has not grown at all in 2015; rather, it looks like the percentage at the end of the year will be slightly lower than it was at the start of the year! Telenor is a long way from having rolled out IPv6 to their entire customer base, so I am truly hoping that they will pick up the pace again in 2016. (In case you’re wondering, the marked drop at the end of January was caused by a critical problem in their network.)

Telenor’s mobile subscribers are doing much better. Their IPv6 traffic has more than doubled in 2015. It is, however, quite worrying to see that the trend over the last couple of months is clearly a negative one.

Telenor is by far Norway’s largest ISP. In absolute numbers, they are without question the largest source of IPv6 traffic too. However, according to Akamai, only 5.5% of Telenor’s subscribers are IPv6-capable, so there is clearly a huge potential for increased IPv6 deployment in Telenor in the future.

Zooming in on Get

Get is growing their IPv6 deployment. It’s not going very fast, but it is a steady positive trend. (The decline in the summer months is better explained by Get’s customers leaving home to go on holidays than anything Get did.)

According to Akamai, 24.1% of Get’s customers are IPv6-capable. This means Get is the ISP with the largest share of IPv6-capable customers in Norway - well done! At the same time, three out of four of their customers remain IPv4-only, so there is plenty of potential for further improvements in 2016.

Summary and hopes for 2016

If I’m being honest, I must say that 2015 turned out to be a rather disappointing year for IPv6 adoption in Norway. In the second half of 2014 I observed a rapid growth, but this trend did unfortunately not continue in 2015.

I’m hoping that Get and Telenor will intensify their IPv6 deployments in 2016. Telenor in particular has a lot of potential for growth - for example, their cable customers must currently manually opt in to get IPv6, and all the Apple devices on their mobile network remain IPv4-only. If neither of those two things changes in 2016 I’ll be very disappointed.

When it comes to the other major national Norwegian ISPs, I truly hope that 2016 will be the year when I’ll start seeing significant amounts of IPv6 traffic from them. I’m thinking in particular about the likes of Altibox, NetCom, and NextGenTel here.

I’d like to end on a more positive note, though. I’ll do that by commending Difi, a.k.a. The Agency for Public Management and eGovernment, for having made significant progress towards making IPv6 support a mandatory requirement in the Norwegian public sector. In 2016 this will likely become Norwegian “law”. The Norwegian public sector is huge and wealthy, so the moment the service providers start realising that lacking IPv6 support will disqualify them from bidding on lucrative government contracts, I think we’ll see quite a few laggards scrambling to catch up.

Happy New IPv6 Year!

IPv6 network boot with UEFI and iPXE

Here at Redpill Linpro we make extensive use of network booting to provision software onto our servers. Many of our servers don’t even have local storage - they boot from the network every time they start up. Others use network boot in order to install an operating system to local storage. The days when we were running around in our data centres with USB or optical install media are long gone, and we’re definitely not looking back.

Our network boot infrastructure is currently built around iPXE, a very flexible network boot firmware with powerful scripting functionality. Our virtual servers (using QEMU/KVM) simply execute iPXE directly. Our physical servers, on the other hand, use their standard built-in PXE ROMs in order to chainload an iPXE UNDI ROM over the network.

IPv6 PXE was first included in UEFI version 2.3 (Errata D), published five years ago. However, not all servers support IPv6 PXE yet, including the ageing ones in my lab. I’ll therefore focus on virtual servers for now, and will get back to IPv6 PXE on physical servers later.

Enabling IPv6 support in iPXE

At the time of writing, iPXE does not enable IPv6 support by default. This default spills over into Linux distributions like Fedora. I’m trying to get this changed, but for now it is necessary to manually rebuild iPXE with IPv6 support enabled.

This is done by downloading the iPXE sources and then enabling NET_PROTO_IPV6 in src/config/general.h. Replace #undef with #define so that the full line reads #define NET_PROTO_IPV6.
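
If you rebuild iPXE regularly, the edit is easy to script. The Python sketch below performs the same replacement on the contents of general.h. The function name is mine, and the sample text in the usage part is only an illustration of the kind of line to look for, not a verbatim copy of the iPXE source tree.

```python
import re

_UNDEF = re.compile(r"^([ \t]*)#undef([ \t]+NET_PROTO_IPV6\b)", re.MULTILINE)
_DEFINE = re.compile(r"^[ \t]*#define[ \t]+NET_PROTO_IPV6\b", re.MULTILINE)


def enable_ipv6(general_h: str) -> str:
    """Return the header text with NET_PROTO_IPV6 switched from #undef to #define."""
    patched, count = _UNDEF.subn(r"\1#define\2", general_h)
    if count == 0 and not _DEFINE.search(general_h):
        raise ValueError("NET_PROTO_IPV6 not found in the supplied text")
    return patched


if __name__ == "__main__":
    # Illustrative input; the real file contains many more settings.
    sample = (
        "#define\tNET_PROTO_IPV4\t/* IPv4 protocol */\n"
        "#undef\tNET_PROTO_IPV6\t/* IPv6 protocol */\n"
    )
    print(enable_ipv6(sample), end="")
```

To apply it to a real tree, read src/config/general.h, pass its contents through `enable_ipv6()`, and write the result back before running make.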

At this point, we’re ready to build iPXE. For the virtio-net driver used by our QEMU/KVM hypervisors, the correct command is make -C /path/to/ipxe/src bin/1af41000.rom. To build a UEFI image suitable for chainloading, run make -C /path/to/ipxe/src bin-x86_64-efi/ipxe.efi instead.

On RHEL7-based hypervisors, upgrading iPXE is just a matter of replacing the default 1af41000.rom file in /usr/share/ipxe with the one that was just built.

Network configuration

The network must be set up with both ICMPv6 Router Advertisements (RAs) and DHCPv6. RAs are necessary in order to provision the booting nodes with a default IPv6 router, while DHCPv6 is the only way to advertise IPv6 network boot options.

When it comes to the assignment of IPv6 addresses, you can use either SLAAC or DHCPv6 IA_NA. iPXE supports both approaches. Avoid using both at the same time, though, as doing so may trigger a bug which could lead to the boot process getting stuck halfway through.

You’ll probably want to provision the nodes with an IPv6 DNS server. This can be done using either DHCPv6 or ICMPv6 RAs. iPXE supports both approaches, so either will do just fine. That said, I recommend enabling both at the same time. It might very well be that some UEFI implementation only supports one of them.

ICMPv6 Router Advertisement configuration

protocol radv {
  # Use Google's public DNS server.
  rdnss {
    ns 2001:4860:4860::8888;
  };
  interface "vlan123" {
    managed no;       # Addresses (IA_NA) aren't found in DHCPv6
    other config yes; # "Other Configuration" is found in DHCPv6 
    prefix 2001:db8::/64 {
      onlink yes;     # The prefix is on-link
      autonomous yes; # The prefix may be used for SLAAC
    };
  };
}

The configuration above is for BIRD. It is all pretty standard stuff, but pay attention to the fact that the other config flag is enabled. This is required in order to make iPXE ask the DHCPv6 server for the Boot File URL Option.

DHCPv6 server configuration

option dhcp6.user-class code 15 = string;
option dhcp6.bootfile-url code 59 = string;
option dhcp6.client-arch-type code 61 = array of unsigned integer 16;

option dhcp6.name-servers 2001:4860:4860::8888;

if exists dhcp6.client-arch-type and
   option dhcp6.client-arch-type = 00:07 {
    option dhcp6.bootfile-url "tftp://[2001:db8::69]/ipxe.efi";
} else if exists dhcp6.user-class and
          substring(option dhcp6.user-class, 2, 4) = "iPXE" {
    option dhcp6.bootfile-url "http://boot.ipxe.org/demo/boot.php";
}

subnet6 2001:db8::/64 {}

The config above is for the ISC DHCPv6 server. The first paragraph declares the various necessary DHCPv6 options and their syntax. For some reason, ISC dhcpd does not appear to have any intrinsic knowledge of these, even though they’re standardised.

The second paragraph ensures the server can advertise an IPv6 DNS server to clients. In this example I’m using Google’s Public DNS; you’ll probably want to replace it with your own IPv6 DNS server.

The if/else statement ensures two things:

  1. If the client is a UEFI firmware performing IPv6 PXE, then we just chainload a UEFI-compatible iPXE image. (As I mentioned earlier, I haven’t been able to fully test this config due to lack of lab equipment supporting IPv6 PXE.)
  2. If the client is iPXE, then we give it an iPXE script to execute. In this example, I’m using the iPXE project’s demo service, which boots a very basic Linux system.

Finally, I declare the subnet prefix where the IPv6-only VMs live. Without this, the DHCPv6 server will not answer any requests coming from this network. Since I’m not using stateful address assignment (DHCPv6 IA_NA), I do not need to configure an IPv6 address pool.
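
For those who find the ISC dhcpd expression syntax hard to read, the if/else statement can be modelled in Python as shown below. This is only an illustration of the matching logic in the config above, not something the DHCPv6 server runs. It assumes that both options are handed over as raw option payloads, and that each user class entry is prefixed by a two-byte length field, which is why the config skips the first two bytes before comparing against "iPXE". The function and constant names are mine.

```python
from typing import Optional

# Raw payload the config compares the client-arch-type option against (00:07).
ARCH_TYPE_MATCH = bytes.fromhex("0007")


def select_bootfile_url(client_arch_type: Optional[bytes],
                        user_class: Optional[bytes]) -> Optional[str]:
    """Mirror the if/else statement: return the Boot File URL, or None if no match."""
    if client_arch_type is not None and client_arch_type == ARCH_TYPE_MATCH:
        return "tftp://[2001:db8::69]/ipxe.efi"
    # substring(option dhcp6.user-class, 2, 4): offset 2, length 4
    if user_class is not None and user_class[2:6] == b"iPXE":
        return "http://boot.ipxe.org/demo/boot.php"
    return None


if __name__ == "__main__":
    print(select_bootfile_url(bytes.fromhex("0007"), None))  # UEFI PXE client
    print(select_bootfile_url(None, b"\x00\x04iPXE"))        # iPXE client
    print(select_bootfile_url(None, None))                   # anything else
```

Note that the first branch takes precedence: a client that sends both a matching architecture type and the iPXE user class is handed the TFTP URL.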

Conclusion

Thanks to iPXE and UEFI, network boot can be made to work just as well over IPv6 as over IPv4. The only real remaining problem is that many server models still lack support for IPv6 PXE, but I am assuming this will become less of an issue over time as vendors upgrade their UEFI implementations to version 2.3 (Errata D) or newer.

In virtualised environments, nothing is missing. Apart from the somewhat annoying requirement to rebuild iPXE to enable IPv6 support, it Just Works. This is evident from the boot log below, which shows a successful boot of a QEMU/KVM virtual machine residing on an IPv6-only network.

[root@kvmhost ~]# virsh create /etc/libvirt/qemu/v6only --console
Domene v6only opprettet fra /etc/libvirt/qemu/v6only
Connected to domain v6only
Escape character is ^]

Google, Inc.
Serial Graphics Adapter 06/09/14
SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $ (mockbuild@) Mon Jun  9 21:33:48 UTC 2014
4 0

SeaBIOS (version seabios-1.7.5-8.el7)
Machine UUID ebe11d4a-11d4-4ae8-b249-390cdf7c79ec

iPXE (http://ipxe.org) 00:03.0 CA00 PCI2.10 PnP PMM+7FF979E0+7FEF79E0 CA00

Booting from Hard Disk...
Boot failed: not a bootable disk

Booting from ROM...
iPXE (PCI 00:03.0) starting execution...ok
iPXE initialising devices...ok

iPXE 1.0.0+ (f92f) -- Open Source Network Boot Firmware -- http://ipxe.org
Features: DNS HTTP iSCSI TFTP AoE ELF MBOOT PXE bzImage Menu PXEXT

net0: 00:16:3e:c2:16:b7 using virtio-net on PCI00:03.0 (open)
  [Link:up, TX:0 TXE:0 RX:0 RXE:0]
Configuring (net0 00:16:3e:c2:16:b7).................. ok
net0: fe80::216:3eff:fec2:16b7/64
net0: 2001:db8::216:3eff:fec2:16b7/64 gw fe80::21e:68ff:fed9:d156
Filename: http://boot.ipxe.org/demo/boot.php
http://boot.ipxe.org/demo/boot.php.......... ok
boot.php : 127 bytes [script]
/vmlinuz-3.16.0-rc4... ok
/initrd.img... ok
Probing EDD (edd=off to disable)... ok

iPXE Boot Demonstration
=======================

Linux (none) 3.16.0-rc4+ #1 SMP Wed Jul 9 15:44:09 BST 2014 x86_64 unknown

Congratulations!  You have successfully booted the iPXE demonstration
image from http://boot.ipxe.org/demo/boot.php

See http://ipxe.org for more ideas on how to use iPXE.

root:/#

Evaluating DHCPv6 relays

One of the few remaining IPv4-only services here at Redpill Linpro is our provisioning infrastructure, which is based on PXE network booting. I’ve long wanted to do something about that. Now that more and more servers are shipping with UEFI support, I am finally in a position to start looking at it.

I’m starting out by figuring out which DHCPv6 relay implementation we’ll be using. This post details my evaluation process, conclusion, and choice.

Network topology

Most of the servers we want to provision are located in a dedicated customer VLAN that is connected to a set of redundant routers running Linux. The routers speak VRRP on each of the VLANs in order to decide which router is the primary one serving the VLAN in question.

Furthermore, each of the routers has multiple redundant uplinks to the core network, and uses a dynamic IGP to ensure optimal routing and fault tolerance. The DHCPv6 server is reached through the core network using unicast routing.

The following figure illustrates the topology:

Our desired capabilities

Our current IPv4-only network boot infrastructure relies heavily on using Ethernet MAC addresses to distinguish between clients. Being able to continue to do so will make the introduction of IPv6 support quick and easy. We would therefore like for the implementation to support the DHCPv6 Client Link-Layer Address Option.
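
To make the option concrete, here is a minimal sketch of how a relay could encode it. It assumes the wire format from RFC 6939 (option code 79, a 16-bit hardware type, then the link-layer address) and handles Ethernet only. The function names are made up for illustration and are not taken from any of the relays evaluated in this post.

```python
import struct

OPTION_CLIENT_LINKLAYER_ADDR = 79  # option code assigned by RFC 6939
HW_TYPE_ETHERNET = 1               # IANA hardware type for Ethernet


def encode_client_linklayer_option(mac: str) -> bytes:
    """Encode a MAC such as '00:16:3e:c2:16:b7' as a DHCPv6 option."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 48-bit Ethernet MAC address")
    payload = struct.pack("!H", HW_TYPE_ETHERNET) + mac_bytes
    # A DHCPv6 option is a 16-bit code, a 16-bit length, then the payload.
    return struct.pack("!HH", OPTION_CLIENT_LINKLAYER_ADDR, len(payload)) + payload


def decode_client_linklayer_option(option: bytes) -> str:
    """Return the MAC address carried in an encoded option (Ethernet only)."""
    code, length = struct.unpack("!HH", option[:4])
    if code != OPTION_CLIENT_LINKLAYER_ADDR:
        raise ValueError("not a Client Link-Layer Address option")
    (hw_type,) = struct.unpack("!H", option[4:6])
    if hw_type != HW_TYPE_ETHERNET:
        raise ValueError("only Ethernet is handled in this sketch")
    return ":".join(f"{b:02x}" for b in option[6:4 + length])
```

A DHCPv6 server that understands the option can then key its configuration on the decoded MAC address, much like an IPv4 setup keys on the hardware address.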

As discussed in the previous section, the network configuration on the routers is dynamic and could change without notice. An ideal DHCPv6 relay implementation would be able to notice such changes and automatically adapt to the new environment. In our environment, this would mean being able to cope with:

  • A new VLAN interface showing up, e.g., when we provision a new customer.
  • The IP address configuration on a VLAN interface changing, e.g., due to a VRRP fail-over event.
  • The route to the DHCPv6 server changing from one uplink interface to another, e.g., due to changed route metrics in our core network.

Finally, regarding the software itself, we’d like for it to be:

  • Free and open-source software.
  • Actively maintained.
  • Available in Ubuntu’s software archive.

Available implementations and their capabilities

From what I was able to determine, there are four available open-source DHCPv6 relay implementations. These are, in alphabetical order:

The versions I tested are shown in parentheses.

Desired capability 1: Support for the Client Link-Layer Address Option

Of the tested implementations, only Dibbler supported this feature. It is enabled by adding the line option link-layer to the configuration file.

Desired capability 2: Detecting new interfaces on the fly

Disappointingly enough, none of the tested implementations were able to do this. dhcpv6 will by default listen on all available interfaces, but it does not detect new interfaces showing up after it has started.

The other three implementations all require that the listening interfaces be configured explicitly.

Desired capability 3: Detecting IPv6 addresses changing during runtime

Only WIDE-DHCPv6 was able to do this. It appears to check what the local address on the interface is every time it relays a packet, so it always sets the link address field in the relayed DHCPv6 packet correctly.

The other three implementations read in the global address (or lack thereof) for each interface when they start, and do not notice any changes. Thus, there is a risk that the link address field in their relayed packets is set incorrectly.
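
The difference between the two approaches can be sketched as follows. This is not code from any of the tested relays. It is a minimal illustration that assumes the Relay-forward layout from RFC 8415 (message type 12, hop count, link address, peer address, then a Relay Message option with code 9). The get_link_address callable is a hypothetical stand-in for however a relay reads the interface's current global address.

```python
import ipaddress
import struct
from typing import Callable

MSG_RELAY_FORW = 12   # Relay-forward message type (RFC 8415)
OPTION_RELAY_MSG = 9  # Relay Message option code (RFC 8415)


def build_relay_forward(client_msg: bytes, link_address: str,
                        peer_address: str, hop_count: int = 0) -> bytes:
    """Wrap a client's DHCPv6 message in a Relay-forward message."""
    link = ipaddress.IPv6Address(link_address).packed
    peer = ipaddress.IPv6Address(peer_address).packed
    header = struct.pack("!BB", MSG_RELAY_FORW, hop_count) + link + peer
    relay_msg = struct.pack("!HH", OPTION_RELAY_MSG, len(client_msg)) + client_msg
    return header + relay_msg


def relay(client_msg: bytes, peer_address: str,
          get_link_address: Callable[[], str]) -> bytes:
    """Look up the link address for every packet instead of caching it."""
    return build_relay_forward(client_msg, get_link_address(), peer_address)
```

A relay that calls the lookup once at startup and reuses the result corresponds to the second behaviour described above: after a VRRP fail-over it would keep sending the old link address.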

Desired capability 4: Coping with route to DHCPv6 server changing

Only dhcpv6 supports this without any weirdness. The address of the DHCPv6 server is specified with the -su command line option, and packets are relayed to it using a standard routing lookup.

ISC DHCP and WIDE-DHCPv6 behave in a rather bizarre way. They both require that the interface facing the DHCPv6 server is explicitly specified on the command line, but for some reason they completely ignore it and instead use a standard routing lookup to reach the server.

Dibbler also requires that the upstream interfaces are explicitly configured. If there is no route to the DHCPv6 server on one of these interfaces, it will log the following error for each DHCPv6 request:

Low-level layer error message: Unable to send data (dst addr: 2001:db8::d)
Failed to send data to server unicast address.

That said, it is possible to simply configure both eth0 and eth1 as upstream interfaces. This will ensure all requests are correctly relayed regardless of which interface has the active route to the DHCPv6 server. Even so, I’m only awarding half a point to Dibbler here, due to both the clunkiness of the workaround and the constant stream of error messages it will result in.

Desired capability 5: Free and open-source software

Yes! Every tested implementation qualifies.

Desired capability 6: Actively maintained

Only Dibbler and ISC DHCP appear to be. According to its own homepage, dhcpv6 was discontinued in 2009. WIDE-DHCPv6 has not seen any release since 2008.

Desired capability 7: Available in Ubuntu’s software archive

Only dhcpv6 is missing; the rest are an apt-get install away.

Conclusion

Out of a maximum of 7 points, the final scores are as follows:

  1. Dibbler: 4.5 points
  2. ISC DHCP: 4 points
  3. WIDE-DHCPv6: 4 points
  4. dhcpv6: 2 points

Disappointingly enough, none of them are able to run continuously in a dynamic environment like ours. To work around this, we’ll probably have to devise a system that automatically generates new configuration and restarts the relay whenever a network configuration change is detected. Should a DHCPv6 request arrive exactly when the relay is being restarted, it will likely be retried within seconds, so this is extremely unlikely to cause any operational issues.
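
One possible shape for such a system is sketched below. Everything in it is hypothetical: NetworkState, read_state and regenerate_and_restart are illustrative names, and how the state is actually collected (for example by parsing the output of the ip command) and how the relay is restarted are deliberately left out.

```python
from typing import Callable, FrozenSet, NamedTuple, Optional, Tuple


class NetworkState(NamedTuple):
    """The parts of the router's configuration the relay depends on."""
    vlan_addresses: FrozenSet[Tuple[str, str]]  # (interface, global IPv6 address)
    uplink: str  # interface currently holding the route to the DHCPv6 server


def watch_once(previous: Optional[NetworkState],
               read_state: Callable[[], NetworkState],
               regenerate_and_restart: Callable[[NetworkState], None]) -> NetworkState:
    """Compare the current state with the previous one and act on any change."""
    current = read_state()
    if current != previous:
        # A new VLAN, a changed address or a moved route all end up here.
        regenerate_and_restart(current)
    return current
```

Calling watch_once periodically, or from a netlink event handler, and feeding its return value back in as previous would cover all three kinds of change listed earlier.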

This workaround will handle the lack of desired capabilities 2 through 4. After disregarding these, only Dibbler gets a full score (due to its support for the Client Link-Layer Address Option). Dibbler is thus the obvious choice.
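
For anyone who wants to check the arithmetic, the scores can be tallied with a few lines of Python. The per-capability points below are my reading of the sections above. In particular, the full point given to ISC DHCP and WIDE-DHCPv6 for capability 4 is inferred from their totals rather than stated outright.

```python
# Points per desired capability 1..7, in order.
SCORES = {
    "Dibbler":     [1, 0, 0, 0.5, 1, 1, 1],
    "ISC DHCP":    [0, 0, 0, 1,   1, 1, 1],
    "WIDE-DHCPv6": [0, 0, 1, 1,   1, 0, 1],
    "dhcpv6":      [0, 0, 0, 1,   1, 0, 0],
}


def tally(scores, skip=()):
    """Sum the points per implementation, ignoring capabilities in skip."""
    return {
        name: sum(p for cap, p in enumerate(points, start=1) if cap not in skip)
        for name, points in scores.items()
    }


def ranking(totals):
    """Order implementations by descending total, then by name."""
    return sorted(totals, key=lambda name: (-totals[name], name))
```

With nothing skipped, tally(SCORES) reproduces the list above. With skip={2, 3, 4}, only Dibbler reaches the remaining maximum of 4 points.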