By Alasdair Lumsden on 25 Nov 2013
Hi Folks, welcome to the first illumos watch blog post, where each month I’ll be sharing some of the latest developments in the world of illumos, OpenSolaris, SmartOS and OpenIndiana!
As this is the first post, I’ll include a few from earlier months that I thought were noteworthy too…
Commits down at the illumos-gate
My personal favourite, this increases the username character limit from 8 characters to 32. While it was possible to use >8 character usernames, they didn’t work everywhere and various utilities would complain.
The man page says it best:
The ipdadm utility is used to administer the illumos facility for simulating pathological networks by induce packet drops, delays, and corruption. This functionality is only able to the global zone and zones with exclusive networking stacks. If this is enabled for the global zone, any zone with a shared networking stack will be affected.
More info on the Joyent Blog: What is ipdadm(1M) Used For?.
This one has been in SmartOS for quite some time, and it’s great to see it in illumos as it’s long overdue. The default list option prints values in human-readable form (e.g. MB/GB/TB) – with the new -p option you can print them in a machine-parsable format, which for sizes means bytes and for times means seconds since the epoch, perfect for scripting.
This one is fairly self-explanatory: you can now inspect the TCP kstats of a zone from the global zone, which is useful for mass monitoring without requiring agents in every zone.
This one I’m curious to see in action. At the moment, if you boot a machine with a missing drive, it shows up in state "UNAVAIL". There’s also state "REMOVED", which in my experience is correctly used if a drive is erroring and is removed by the OS. However, I can’t recall what happens if you just pull a drive out, so perhaps this addresses that. Either way, improvements such as this are always useful.
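To show why exact values are handy in scripts, here is a minimal Python sketch. The sample rows and their layout (name, size in bytes, creation time in epoch seconds, tab-separated) are invented for illustration and are not captured from a real command.

```python
from datetime import datetime, timezone

# Hypothetical tab-separated rows: name, size in bytes, creation time (epoch seconds).
SAMPLE = "tank/home\t1073741824\t1385337600\ntank/vm\t5368709120\t1385424000\n"


def parse_rows(text):
    """Turn machine-parsable rows into (name, bytes, UTC datetime) tuples."""
    rows = []
    for line in text.splitlines():
        name, size, created = line.split("\t")
        when = datetime.fromtimestamp(int(created), tz=timezone.utc)
        rows.append((name, int(size), when))
    return rows


def total_bytes(rows):
    """Sum sizes exactly; no need to decode suffixes such as 1.5G or 200M."""
    return sum(size for _, size, _ in rows)


rows = parse_rows(SAMPLE)
print(total_bytes(rows))
```

With human-readable output the same sum would first need unit parsing and would lose precision to rounding.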
Fixes and minor enhancements
There’s a long list of fixes and minor enhancements. Notable ones include improved argument handling for ld(1) from richlowe (4270), logadm improvements (4301, 4300, 4299), ZFS deadlock fixes (4161, 4322), and tail -f now noticing file truncation (3928).
There’s also a fix for a very serious kernel panic issue in the new NFS lock manager, on the client side (4198). We hit this bug in production, so special thanks go to Marcel Telka for fixing it so darn quickly.
Commits over at SmartOS Towers
ZFS Throttle Improvements
Joyent have done some work to improve the commenting of the ZFS I/O throttling code, along with enhancing it slightly. The enhancement is mostly described by this comment from the code:
 * To minimize porpoising, we have three separate states for our
 * assessment of I/O performance: overutilized, underutilized, and
 * neither overutilized nor underutilized. We will increment the
 * throttle if a zone is using more than its fair share _and_ I/O
 * is overutilized; we will decrement the throttle if a zone is using
 * less than its fair share _or_ I/O is underutilized.
 */
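The rule in that comment can be sketched in a few lines of Python. This is only a toy model of the decision described, not the SmartOS code: the function name, the numeric share values, the step size and the clamp at zero are all assumptions made for illustration.

```python
def adjust_throttle(zone_usage, fair_share, io_state, throttle, step=1):
    """Return the new throttle value for one zone.

    io_state is one of "over", "under" or "neither", matching the three
    states named in the comment. step and the floor of zero are assumptions.
    """
    if zone_usage > fair_share and io_state == "over":
        # Greedy zone AND the I/O system is overutilized: throttle harder.
        return throttle + step
    if zone_usage < fair_share or io_state == "under":
        # Modest zone OR spare I/O capacity: back the throttle off.
        return max(0, throttle - step)
    # Otherwise leave the throttle alone, which is what damps oscillation.
    return throttle


print(adjust_throttle(60, 50, "over", 5))
print(adjust_throttle(60, 50, "neither", 5))
```

The middle "neither" state is the interesting part: a zone over its share is left alone while I/O is merely busy, so the throttle does not bounce up and down on every sample.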
- OS-1315 Please, oh please, let me clear N services in maintenance across zones from the GZ
- OS-2566 Want svcadm restart -d […] for taking cores before restart
- OS-2567 SMF: allow svcadm to act on multiple instances simultaneously
- OS-2574 svcadm could use -Z option
The titles are mostly self-explanatory, and it’s nice to see this tool receive enhancements as it’s used so frequently.
OS-2556 make existing zfs filesystem limit feature obsolete
This one is quite interesting – Joyent added a filesystem limits feature to limit the number of child datasets, snapshots, clones, etc. that a parent dataset can create. This is useful for zone-delegated datasets in a multi-tenanted environment. Here, the actual feature isn’t being obsoleted; rather, the code is being reworked for upstreaming into illumos. (Thanks to Robert Mustacchi for providing more info on this.)
OS-2495 add support for multiple mac addresses per client
Another useful feature: this adds support for multiple MAC addresses to the dladm layer, presumably to allow KVM guests to have multiple MAC addresses.
OS-2544 ipf rules from the GZ should be add to in-zone rules, not replace them
 * For each non-global zone, we create two ipf stacks: the per-zone stack and
 * the GZ-controlled stack. The per-zone stack can be controlled and observed
 * from inside the zone or from the global zone. The GZ-controlled stack can
 * only be controlled and observed from the global zone (though the rules
 * still only affect that non-global zone).
 *
 * The two hooks are always arranged so that the GZ-controlled stack is always
 * "outermost" with respect to the zone.
I wasn’t aware of this feature, but it seems on SmartOS there are two IPF firewall stacks per zone, allowing you to control firewalling from the global zone. I can see this being a very useful feature if you want to mandate global firewall policy.
In the hosting world, it’s common to also provide basic "network-level" firewalling, which clients will assume is being done by some perimeter firewall, but if you can do it from the global zone, all the better – logically there’s no real distinction from a client’s perspective.
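One way to picture the "outermost" arrangement is the toy Python model below. It is not how ipf is implemented: the rule format, the first-match evaluation and every name in it are invented for illustration. It assumes "outermost" means inbound packets meet the GZ-controlled stack first and outbound packets meet it last.

```python
def allow(rules, packet):
    """Toy rule evaluation: first matching rule wins, default is pass.
    (A simplification for this sketch, not real ipf semantics.)"""
    for matches, verdict in rules:
        if matches(packet):
            return verdict == "pass"
    return True


def filter_packet(packet, direction, gz_rules, zone_rules):
    """Run a packet through both stacks; return (allowed, stacks consulted)."""
    stacks = [("gz", gz_rules), ("zone", zone_rules)]
    if direction == "out":
        stacks.reverse()  # leaving the zone: per-zone stack first, GZ stack last
    consulted = []
    for name, rules in stacks:
        consulted.append(name)
        if not allow(rules, packet):
            return False, consulted
    return True, consulted


# Hypothetical policies: the GZ blocks telnet, the zone owner blocks SMTP.
gz_rules = [(lambda p: p["port"] == 23, "block")]
zone_rules = [(lambda p: p["port"] == 25, "block")]

print(filter_packet({"port": 23}, "in", gz_rules, zone_rules))
print(filter_packet({"port": 80}, "out", gz_rules, zone_rules))
```

A packet has to satisfy both stacks, so the global zone policy adds to whatever the zone owner configured rather than replacing it.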
From the Water Cooler
L2ARC Persistence & ZFS UNMAP/SATA TRIM
Saso Kiselkov has been working on some excellent ZFS features over the past year, such as L2ARC cache persistency and ZFS SCSI Unmap/SATA TRIM support. By the sounds of it the ZFS bit is pretty much done, but the SCSI layer isn’t. Saso is asking for reviews of his L2ARC persistency code, so hopefully that will arrive in-gate soon. More on the thread over on listbox.
N-Way mirror read performance, Steven Hartland (FreeBSD)
Looks like some work done by Steven Hartland of the FreeBSD project on N-Way mirror read performance is on its way to illumos, although the thread it generated was quite a long one.
This is quite interesting in light of the benchmark work I did on the different ZFS RAID Levels (Google Docs results, Blog Post), where ZFS RAID10 read performance is already ahead of RAIDZ by quite a bit.
Ongoing 4k Sector Drive Discussions
It seems a month doesn’t go by without more discussions about 4k sector drives/zpool ashift settings:
- Inefficient zvol space usage on 4k drives
- RAIDz and 4k Sectors
- forward-looking zpools: ashift=12 and smaller vdevs
My basic advice is to avoid 4k sector drives if you can. If you must use them, try to avoid RAIDZ. Think of them as being for bulk storage, not VM storage or general purpose usage.
This Delphix blog post by George Wilson on 4K Sectors, ZFS and ashift is useful.
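To put rough numbers on the RAIDZ problem, here is a small Python sketch of how much raw space one block occupies. It follows the allocation rounding in the RAIDZ vdev code as I understand it (data sectors, plus parity per stripe row, rounded up to a multiple of nparity + 1), so treat it as an approximation; the 6-disk RAIDZ2 layout and the block sizes are just example inputs.

```python
def raidz_asize(psize, ashift, cols, nparity):
    """Approximate raw bytes allocated for one block of psize bytes.

    ashift is log2 of the sector size (9 = 512 bytes, 12 = 4k),
    cols is the number of disks in the vdev, nparity is 1, 2 or 3.
    """
    data_cols = cols - nparity
    sectors = ((psize - 1) >> ashift) + 1                        # data sectors
    sectors += nparity * ((sectors + data_cols - 1) // data_cols)  # parity sectors
    multiple = nparity + 1
    sectors = -(-sectors // multiple) * multiple                 # round up (padding)
    return sectors << ashift


# An 8K block on a hypothetical 6-disk RAIDZ2 vdev:
print(raidz_asize(8192, 9, 6, 2))    # 512-byte sectors
print(raidz_asize(8192, 12, 6, 2))   # 4k sectors
```

Under these assumptions an 8K block takes 12K of raw space with 512-byte sectors and 24K with 4k sectors, while a 128K block takes the same 192K either way. That is why small-block workloads such as zvols hurt the most on 4k drives.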
OpenJDK vs Oracle JDK
A short thread on OpenJDK vs Oracle JDK cropped up on the SmartOS list. The short answer, of course, is that there’s not much of a difference, but the binary Oracle JDK can’t be redistributed. Alain O’Dea reports from the field that he’s been using OpenJDK for over a year in production without issue, which is reassuring.
Fascinating exchange between Andrew Galloway and Richard Elling
Here’s an example of a discussion between two different styles of thinking. Personally, I’d side with Andrew – statements should reflect the common case found in the real world.
The takeaway is that with illumos as it stands, you may, depending on your workload, want to limit your ARC to 128GB of RAM until Nexenta upstream a fix to a performance problem with large ARCs.
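For reference, the usual way to cap the ARC on illumos is the zfs_arc_max tunable in /etc/system, which takes a value in bytes and is read at boot. The Python sketch below just does the arithmetic and prints the line; it assumes "128GB" means 128 GiB, and the helper name is made up.

```python
GIB = 1 << 30  # bytes in one GiB


def arc_max_line(gib):
    """Build an /etc/system line capping the ARC at the given number of GiB."""
    return "set zfs:zfs_arc_max = 0x%x" % (gib * GIB)


print(arc_max_line(128))
```

Check the value against your own memory sizing before rebooting with it, since the right cap depends on the workload.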
High Availability with Zones
Providing high availability with Zones often requires some kind of IP Address failover. Joyent have managed to get VRRP almost working (it’s still a bit buggy), but fear not – there are multiple options available to solve this particular issue. At EveryCity we use Wackamole with Spread, and there’s also ucarp. The VRRP and 2 HAProxy on different smartos servers thread discusses the different approaches.
There’s a good write-up by Guillaume Hilt on the SmartOS wiki: High Availability with Wackamole. Theo Schlossnagle, co-creator of Wackamole, mentioned that people should be using Vippy, which is the successor to Wackamole.
OpenIndiana Hipster work continues
It’s good to see that work continues on the hipster branch of OpenIndiana at quite a pace. I’m also pleased to see some other names committing, such as Aurélien Larcher, Gordon Ross of Nexenta and Marcel Telka. Plus, as always, there’s a fantastic stream of work from Alexander Pyhalov, followed closely by Adam Števko and Andrzej Szeszo.
Whoa, what a lot of goings on in the past month or so. I didn’t realise this blog post would end up quite so big! It just goes to show that the illumos and friends community is really thriving in this post-Oracle world. Catch you all next month for more updates… :-)