Tuesday, July 10, 2018

The Proper Way to Take a Storage Spaces Direct Server Offline for Maintenance?

Way back in September 2017 a Microsoft update changed what happens when you suspend (pause) or resume a node in an S2D cluster. I don't think the article "Taking a Storage Spaces Direct Server offline for maintenance" has been updated to reflect this change.

Prior to the September 2017 update, when you suspended a node, either via the PowerShell Suspend-ClusterNode cmdlet or via the Pause option in the Failover Cluster Manager GUI, the operation would put all the disks on that node into maintenance mode. Then when you resumed the node, the resume operation would take the disks out of maintenance mode.

The current suspend/resume logic does nothing with the disks. If you suspend a node its disks don't go into maintenance mode, and if you resume the node nothing is done to the disks.

I postulate that after you suspend/drain the node, and prior to shutting it down or restarting it, you need to put the disks for that node into maintenance mode. This can be done with the following PowerShell command:

Get-StorageFaultDomain -Type StorageScaleUnit | Where-Object {$_.FriendlyName -eq "node1"} | Enable-StorageMaintenanceMode

Be sure to change "node1" in the above PowerShell snippet to the name of the node you've suspended.

When the node is powered on/rebooted, prior to resuming the node, you need to take the disks for that node out of maintenance mode. This can be done with the following command:

Get-StorageFaultDomain -Type StorageScaleUnit | Where-Object {$_.FriendlyName -eq "node1"} | Disable-StorageMaintenanceMode

The reason I'm thinking this should be done is that if you just reboot the node without putting its disks in maintenance mode, the cluster will behave as if it lost that node. Things will recover eventually, but timeouts may occur, and if your system is extremely busy with IO bad things could happen (VMs rebooting, CSVs moving, etc.). I'm thinking it's better to put the disks for the node in maintenance mode so all the other nodes know what's going on and the recovery logic doesn't need to kick in. Think of it this way: it's better to tell all the nodes what's going on than to make them figure it out for themselves... I need to test this theory some more...
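
Put together, the sequence described above might look like the following sketch. It is as untested as the theory itself. "node1" is a placeholder, the -Drain/-Wait and -Failback choices and the use of Restart-Computer are assumptions rather than anything prescribed above, and it is meant to be run from a different node than the one being serviced.

```powershell
# Sketch only: "node1" is a placeholder for the node being serviced.
# Run this from another cluster node, not from node1 itself.
$node = "node1"

# 0. Check that all virtual disks are healthy before starting.
Get-VirtualDisk | Format-Table FriendlyName, OperationalStatus, HealthStatus

# 1. Pause the node and drain its roles to the other nodes.
Suspend-ClusterNode -Name $node -Drain -Wait

# 2. Put the node's disks into storage maintenance mode.
Get-StorageFaultDomain -Type StorageScaleUnit |
    Where-Object { $_.FriendlyName -eq $node } |
    Enable-StorageMaintenanceMode

# 3. Restart the node (or shut it down and do the maintenance by hand).
Restart-Computer -ComputerName $node -Wait -For PowerShell -Force

# 4. Once the node is back, take its disks out of maintenance mode.
Get-StorageFaultDomain -Type StorageScaleUnit |
    Where-Object { $_.FriendlyName -eq $node } |
    Disable-StorageMaintenanceMode

# 5. Resume the node and let its roles fail back.
Resume-ClusterNode -Name $node -Failback Immediate

# 6. Watch the resync; wait for the jobs to finish before touching another node.
Get-StorageJob
Get-VirtualDisk | Format-Table FriendlyName, OperationalStatus, HealthStatus
```

Steps 2 and 4 are the only part that the current suspend/resume logic no longer does for you; everything else is the usual pause, reboot, resume routine.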

Friday, June 29, 2018

View Physical Disks by Node in S2D Cluster

Quick snippets to view physical disks by node in an S2D cluster:

Get-StorageNode |%{$_.Name;$_ | Get-PhysicalDisk -PhysicallyConnected}

The following is useful if you're looking at performance monitor and you're trying to figure out which device number is which:

gwmi -Namespace root\wmi ClusPortDeviceInformation | sort ConnectedNode,ConnectedNodeDeviceNumber,ProductId | ft ConnectedNode,ConnectedNodeDeviceNumber,ProductId,SerialNumber

Thursday, May 17, 2018

Troubleshooting S2D / Clusters

So you have a problem with S2D, or you just fixed a problem and you want to figure out why it happened so it doesn't happen again. Run Get-ClusterDiagnosticInfo in PowerShell. It will create a zip file in c:\users\<username>\ that contains the cluster logs and all relevant events and settings. Do this right away so you have the data. You can always analyze it at a later date.

Sunday, May 13, 2018

S2D Replacing PhysicalDisk Quick Reference

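List physical disks to find the failed disk. Note its serial number (the reliability counters only show the DeviceId, so match it against the second command).
Get-PhysicalDisk | Get-StorageReliabilityCounter
Get-PhysicalDisk | Select-Object DeviceId, FriendlyName, SerialNumber, HealthStatus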

List virtual disks that use the drive, remember them for later.
Get-PhysicalDisk -SerialNumber A1B2C3D4 | Get-VirtualDisk

"Retire" the Physical Disk to mark the drive as inactive, so that no further data will be written to it.
$Disk = Get-PhysicalDisk -SerialNumber A1B2C3D4
$Disk | Set-PhysicalDisk -Usage Retired

S2D should start to rebuild the virtual disks that utilized the drive.

To be extra safe, run the following on each of the virtual disks that was listed above.
Repair-VirtualDisk -FriendlyName 'VirtualDiskX'

These storage jobs will likely take some time.

Remove the retired drive from the storage pool.
Get-StoragePool *S2D* | Remove-PhysicalDisk -PhysicalDisks $Disk

Physically remove the bad disk.
Physically add a new disk (you could perform this first if you have empty drive bays) and check to see if it was added to the storage pool.
Get-PhysicalDisk | ? CanPool -eq $true

If nothing is returned it should have been added to the pool, this is what you want as S2D should claim all disks.

If it wasn't added to the pool try the following:
$newDisk = Get-PhysicalDisk | ? CanPool -eq $true
Get-StoragePool *S2D* | Add-PhysicalDisk -PhysicalDisks $newDisk -Verbose

Find the new disk's serial number and then see if any virtual disks are using it. None should be yet.
Get-PhysicalDisk -SerialNumber NEWSNBR | Get-VirtualDisk

Rebalance storage pool
Get-StoragePool *S2D* | Optimize-StoragePool
Get-VirtualDisk | Repair-VirtualDisk

Now virtual disks should be using it
Get-PhysicalDisk -SerialNumber NEWSNBR | Get-VirtualDisk
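
The repair and rebalance steps above kick off storage jobs, and it helps to wait for them to finish before moving on. A minimal sketch of how that could be watched follows. The 30-second interval is arbitrary, and treating "no job in the Running state" as "done" is an assumption; jobs in other states would need their own handling.

```powershell
# Poll until no storage job reports a Running state.
while (Get-StorageJob | Where-Object JobState -eq 'Running') {
    Get-StorageJob |
        Format-Table Name, JobState, PercentComplete, BytesProcessed, BytesTotal
    Start-Sleep -Seconds 30   # arbitrary polling interval
}

# Confirm the virtual disks are healthy again.
Get-VirtualDisk | Format-Table FriendlyName, OperationalStatus, HealthStatus
```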

Friday, May 11, 2018

Resize S2D Volume Quick Reference

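# Check the current size of the virtual disk.
Get-VirtualDisk vd01
# Resize the storage tier(s) behind the virtual disk. This applies to tiered virtual disks;
# a virtual disk without tiers would be resized with Resize-VirtualDisk instead.
Get-VirtualDisk vd01 | Get-StorageTier | Resize-StorageTier -Size 1.5TB
# Confirm the new size.
Get-VirtualDisk vd01

# Check the volume size before extending the partition.
Get-VirtualDisk vd01| Get-Disk | Get-Partition | Get-Volume
$VirtualDisk = Get-VirtualDisk vd01
# Partition 2 is assumed to be the data partition.
$Partition = $VirtualDisk | Get-Disk | Get-Partition | Where PartitionNumber -Eq 2
# Grow the partition to the maximum size the resized virtual disk supports.
$Partition | Resize-Partition -Size ($Partition | Get-PartitionSupportedSize).SizeMax
# Confirm the volume now shows the new size.
Get-VirtualDisk vd01| Get-Disk | Get-Partition | Get-Volume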

Thursday, May 10, 2018

The Case Against 2-Node S2D Solutions and 2-Way Mirroring

So I've got a two node S2D cluster cooking. The last two times I patched it, one of the volumes lost redundancy (just one volume). The first time it happened I couldn't figure out how to fix it. I ended up blowing the volume away, creating a new one and restoring from backups. This led me down the path of trying to figure out how to fix this issue in the future, which led to this blog post.

The second time my volume lost redundancy after rebooting a server I thought I was ready for it, since I had figured out how to resolve the no redundancy state. Preparing for the worst though, I copied all the VMs off of it so I would have more recent state than from backup. All of the VMs copied except for one. I don't recall the error message it gave me but I think it said something about being unable to read from the disk. This should have been my first clue as to the root cause.

In any case, I had a volume with no redundancy and I attempted the steps I discovered to recover the volume. It didn't work. No matter what I tried. I ended up blowing away the volume again, recreating a new volume and restoring the VMs.

After further investigation it would appear that one of the disks is going bad. I determined this by running the following:

> Get-PhysicalDisk | Get-StorageReliabilityCounter

DeviceId Temperature ReadErrorsUncorrected Wear PowerOnHours
-------- ----------- --------------------- ---- ------------
2                    0                     0    114
5000                                       0    1430
5012                 648                   0    1064
5004                                       0    1417
5006                 0                     0    1051
5010                 0                     0    1064
5009                 0                     0    1051
5003                                       0    1417
5011                 0                     0    1064
5008                 0                     0    1050
5013                 0                     0    1064
5007                 0                     0    1050
5001                                       0    1430

When I run Get-PhysicalDisk all the disks return healthy though. So, the disk is starting to have issues but not enough for the system to think the disk is total garbage yet?

Turns out when I restarted the server without the failing disk, the server WITH the failing disk was the only source of data and it couldn't read from all portions of the failing drive. Hence the no redundancy state. I'm thinking it couldn't read a sector and it couldn't find a redundant copy of the data. Now if this was a three node cluster with 3-way mirroring it could have read from the tertiary copy.

I'm not sure why S2D doesn't take a more proactive approach to resolve the failing disk or at least highlight it more. I'm also not sure why it wouldn't allow me to attach the disk after both nodes were back online. Perhaps the "There is not enough redundancy remaining to repair the virtual disk" warning was because S2D wanted to try and move the data but I needed to add another disk? I was only using 4 TB out of 24TB though, you'd think S2D could move everything off the failing disk to the available space... Perhaps it couldn't attach the disk because changed data could not be replayed from the failing disk to restore the mirror?

I would rather S2D evict the disk right away and make the issue at hand obvious. Or create another health state that indicates a physical disk is in a failing state and trickle that status all the way up to the virtual disks and volumes as unhealthy. Or give us an option to set a threshold for the number of UREs (unrecoverable read errors) at which a disk is failed.

Long story short, check your disks and make sure none of them are going bad before you reboot servers if you have a two node S2D cluster or if you implement two-way mirroring. Also, if you do encounter the no redundancy state, it's best to copy as much data off of the volume as you can before trying to fix it.
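
A pre-reboot check along those lines could be sketched as follows. It ties the reliability counters back to the physical disks so the serial number is at hand. Flagging any disk with more than zero uncorrected read errors is my own assumption about what counts as "going bad", not a Microsoft threshold.

```powershell
# Collect every disk that reports uncorrected read errors.
# Disks that report no value for the counter are skipped.
$suspect = foreach ($disk in Get-PhysicalDisk) {
    $counter = $disk | Get-StorageReliabilityCounter
    if ($counter.ReadErrorsUncorrected -gt 0) {
        [pscustomobject]@{
            FriendlyName          = $disk.FriendlyName
            SerialNumber          = $disk.SerialNumber
            HealthStatus          = $disk.HealthStatus
            ReadErrorsUncorrected = $counter.ReadErrorsUncorrected
        }
    }
}

if ($suspect) {
    $suspect | Format-Table -AutoSize
    Write-Warning "Disks with uncorrected read errors found. Deal with them before rebooting a node."
} else {
    "No disks report uncorrected read errors."
}
```

As the output earlier shows, a disk can report HealthStatus Healthy and still show up in this list, which is the whole point of checking the counters separately.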

Another takeaway from this is that it would seem wiser to create more, smaller volumes instead of fewer, larger volumes, of course keeping the number of volumes a multiple of your node count.

Update: I got the following response from Microsoft
"The challenge here is that you had a misbehaving drive… and that’s kind of a gray area.  We handle very well when drives work great… and we handle very well when they fail completely.  But when does bad… become bad enough?  And how do we balance not generating false positives that makes you go replacing drives unnecessarily, and pointlessly wasting money.   With that said, this is an area we are working on.  In Windows Server 2019 we are making enhancements to our Health Service to add what we term marginal drive handling right now (we’ll come up with a better name by ship).

We also hear the feedback that some customers may want higher resiliency out of a 2-node solution, that is another problem we are looking at.  Be mindful that it will come at a cost of reduced efficiency…  but we want to offer customers the choice to do what makes sense for their deployment scenario."
So Microsoft is working on improving the experience with 2-way mirrors and 2-node S2D deployments. I commend the S2D team, they're very responsive to emails and they listen to what customers have to say. I'm excited to see the improvements with Windows Server 2019.

I just wish there was a way to tweak the algorithm that decides when a drive is bad. I'd personally fail it sooner rather than later.

Thursday, May 3, 2018

Hyper-V SET and NLB in Guest, Duplicate Frames/Packets

Just a quick blurb about switch embedded teaming and Microsoft network load balancing in guest virtual machines. If you've got NLB set up in multicast operational mode running in a VM on top of a Hyper-V host that has SET configured, your VM will receive duplicate frames/packets. It would appear that all the NICs in the SET receive the data and pass it up the network stack. If your upper layer network protocols handle duplicate packets you don't necessarily have to worry about it. ICMP ping, for example, does not, and you will receive multiple responses from your VM. For NLB I would abandon SET and use active/passive NIC teaming.
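
For reference, an active/passive team along those lines could be sketched as below. The adapter, team and switch names ("NIC1", "NIC2", "NlbTeam", "NlbSwitch") are hypothetical, the teaming mode and load balancing algorithm are only one possible choice, and the existing SET switch would have to be removed from those adapters first.

```powershell
# Hypothetical names throughout; adjust to your own adapters.
# Create a classic (LBFO) team from two physical NICs.
New-NetLbfoTeam -Name "NlbTeam" -TeamMembers "NIC1", "NIC2" `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic -Confirm:$false

# Make the second NIC a standby member so only one NIC is active at a time.
Set-NetLbfoTeamMember -Name "NIC2" -AdministrativeMode Standby

# Bind a Hyper-V virtual switch to the team interface instead of using SET.
New-VMSwitch -Name "NlbSwitch" -NetAdapterName "NlbTeam" -AllowManagementOS $true
```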