Killing a Solaris 10 Zone stuck in the shutting_down state.


So, you have a Solaris 10 Zone. You’ve run “zoneadm -z zonename halt”. It hasn’t quite shut down, and is stuck in the shutting_down state. What can you do to fix it?

Well, sometimes a few processes inside the zone don’t die in a timely fashion. Check which ones are still running by running the following command from the global zone:

# ps -fz zonename

If any processes other than zsched are running, kill -9 them. Once they are gone, the zone should finish shutting down.
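If there are a lot of stragglers, picking the PIDs out by hand gets tedious. Here is a minimal sketch of a filter for that. It assumes you feed it "PID COMMAND" lines, such as the output of ps -o pid= -o comm= -z zonename run from the global zone. The function name non_zsched_pids is made up for illustration.

```shell
# Print the PID of every process in the listing except zsched.
# Input on stdin: lines of "PID COMMAND", for example from
#   ps -o pid= -o comm= -z zonename
non_zsched_pids() {
    awk '$2 != "zsched" { print $1 }'
}

# Hypothetical usage from the global zone (zonename is a placeholder):
#   ps -o pid= -o comm= -z zonename | non_zsched_pids | xargs kill -9
```

Leaving zsched out matters because it is the zone's own kernel process, and it only goes away when the zone itself does.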

If it doesn’t, and you’re left with zsched as the only remaining process, you may have hit a bug such as bug 6272846, “User orders zone death; NFS client thumbs nose”. This bug has been outstanding since May 2005, so don’t expect a fix any time soon.

Thankfully there are a few more things you can try from the global zone to kill the damn zone off. Give some of the following a go:

# zoneadm -z zonename unmount -f
# zoneadm -z zonename reboot -- -s
# pkill -9 -z zonename

The above combo should deliver a fatal blow to your Zone. If not, bitch at Sun. Hopefully they’ll sort their lives out.
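If you would rather not fire everything blindly, you can check between steps whether the zone is still stuck. The following is a rough sketch, not a tested procedure. It assumes a ksh or bash style shell and the colon-separated output of zoneadm -z zonename list -p, with the state in the third field. The function name still_shutting_down is made up for illustration.

```shell
# Read one line of `zoneadm -z zonename list -p` on stdin and succeed
# (exit status 0) only if the state field is still shutting_down.
# Assumed field order: zoneid:zonename:state:zonepath:...
still_shutting_down() {
    [ "$(cut -d: -f3)" = "shutting_down" ]
}

# Hypothetical escalation loop for the global zone (zonename is a placeholder):
#   for step in "unmount -f" "reboot -- -s"; do
#       zoneadm -z zonename list -p | still_shutting_down || break
#       zoneadm -z zonename $step
#   done
```

The loop stops as soon as the zone leaves the shutting_down state, so you only escalate as far as you need to.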


9 Responses to “Killing a Solaris 10 Zone stuck in the shutting_down state.”

  1. Brian says:

    2012, and the same issue, with little on Oracle’s website about proper procedures for cleaning up the mount and zone.

    Created a case with Oracle for the fun of it.

    Tried everything with the exception of the pkill.

    waiting for Oracle

    Thanks for the info

    -Brian

    • Alasdair says:

      Hi Brian,

      There are existing bugs in the Solaris codebase related to shutting down zones. If you’re not wedded to Oracle, I’d recommend checking out SmartOS. They have fixed most (all?) of them and we haven’t had any issues shutting down Zones since moving to it.

      Good luck with Oracle support!

  2. Bruce K says:

    In many cases, when a zone is stuck in the “down” state while shutting down, it is because you have a session in the global zone whose current directory is inside a path of the zone you are shutting down, e.g. /zones//root/…..
    In this example, zone “laker” is stuck in the “down” state. Check all your windows, or use fuser to see who is in any of laker’s filesystems via the global zone, and either kill the process or cd out of the path.
    caramel# zoneadm list -cv
      ID NAME     STATUS   PATH           BRAND    IP
       0 global   running  /              native   shared
       5 bobcat   running  /zones/bobcat  native   shared
       6 laker    down     /zones/laker   native   shared

    Check all your open windows in the global zone and you will find one that is no longer in a valid path. Just type cd and press Enter, or cd /, and the zone that was stuck in the “down” state will come all the way down.

  3. rajeesh says:

    OMG!!! this works…thanks a ton .. i was about to reboot the server for this bloody prob!!! thanks mate thanks a ton .

  4. sharath says:

    The umount command worked. Awesome!!!

  5. W Sanders says:

    Still not fixed as of 2013

    • Alasdair says:

      W Sanders,

      Is that on the official Solaris 10/11?

      As I mentioned above, it’s fully fixed in SmartOS, thankfully. We switched to SmartOS and haven’t looked back – recently migrated over 250 Solaris 10 Zones to SmartOS and are very happy with the results :-)

      If you’re in a corporate environment, SmartDatacenter may be worth looking at, which is Joyent’s commercial product based on SmartOS.

  6. Nakarti says:

    I did this on a global zone that’s scheduled for reboot. The zones are stuck with a sync process that I can’t kill. Good thing a reboot was planned!

  7. Alasdair says:

    Nakarti,

    Did it work? It might not: “sync” processes usually indicate stuck I/O, which is a more serious condition. Which Solaris version was this on?

    I can report that we haven’t had any stuck zones since moving to SmartOS, as Joyent have hunted down and fixed all the bugs thanks to their massive cloud scale.
