Tuesday, June 20, 2017

Storage Spaces Direct (S2D) Storage Jobs Suspended and Degraded Disks

Storage Spaces Direct is great, but every once in a while an S2D storage job will get stuck and just sit there in a suspended state. This usually happens after a reboot of one of the nodes in the cluster.

What you don't want to do is take a different node out of the cluster while a storage job is stuck and while there are degraded virtual disks.

You should make a habit of checking the storage jobs and the virtual disk status before changing node membership. You can do this easily with the Get-StorageJob and Get-VirtualDisk cmdlets. Alternatively, you could use the script I wrote to continually refresh the status of both the S2D storage jobs and the virtual disks.
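A minimal sketch of that pre-check, assuming that any storage job not in the Completed state and any virtual disk whose HealthStatus is not Healthy should block maintenance. The function name Test-S2DReadyForMaintenance and its parameters are hypothetical, not part of the Storage module; the parameters default to live Get-StorageJob and Get-VirtualDisk output.

```powershell
# Hypothetical helper: returns $true only when no storage job is pending
# and every virtual disk reports Healthy.
function Test-S2DReadyForMaintenance {
    param(
        # Defaults query the live cluster; pass objects explicitly to check offline.
        [object[]]$StorageJob  = @(Get-StorageJob),
        [object[]]$VirtualDisk = @(Get-VirtualDisk)
    )

    # Assumption: anything other than Completed (Running, Suspended, ...) is still pending.
    $pendingJobs    = @($StorageJob  | Where-Object { "$($_.JobState)" -ne 'Completed' })
    $unhealthyDisks = @($VirtualDisk | Where-Object { "$($_.HealthStatus)" -ne 'Healthy' })

    foreach ($job in $pendingJobs) {
        Write-Warning "Storage job '$($job.Name)' is $($job.JobState)"
    }
    foreach ($disk in $unhealthyDisks) {
        Write-Warning "Virtual disk '$($disk.FriendlyName)' is $($disk.HealthStatus)"
    }

    return ($pendingJobs.Count -eq 0 -and $unhealthyDisks.Count -eq 0)
}
```

Usage: run Test-S2DReadyForMaintenance on a cluster node, and only pause or drain another node if it returns True.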

So what do you do if a storage job is stuck? I've found two cmdlets that will fix this. The first is Optimize-StoragePool. The second is Repair-VirtualDisk. Start with Optimize-StoragePool, and if that doesn't work, move on to Repair-VirtualDisk. Here is how you use them:

Get-StoragePool <storage pool friendly name> | Optimize-StoragePool

Example: Get-StoragePool s2d* | Optimize-StoragePool

Get-VirtualDisk <virtual disk friendly name> | Repair-VirtualDisk

Example: Get-VirtualDisk vd01 | Repair-VirtualDisk

Usually, optimizing the storage pool takes care of the hung storage job and fixes the degraded virtual disk. If it doesn't, target the degraded virtual disk directly with Repair-VirtualDisk.
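The two-step escalation can be sketched as a single function. This is an illustration under assumptions, not a tested procedure: the function name Repair-S2DStuckJob is hypothetical, the default pool pattern 'S2D*' is taken from the example above, and "still needs repair" is assumed to mean HealthStatus is not Healthy.

```powershell
# Hypothetical helper: optimize the pool first, then repair only the
# virtual disks that are still not healthy afterwards.
function Repair-S2DStuckJob {
    param(
        # Assumption: the pool's friendly name matches this pattern.
        [string]$PoolName = 'S2D*'
    )

    # Step 1: optimize the storage pool.
    Get-StoragePool -FriendlyName $PoolName | Optimize-StoragePool

    # Step 2: re-check the virtual disks and target any that are still unhealthy.
    Get-VirtualDisk |
        Where-Object { "$($_.HealthStatus)" -ne 'Healthy' } |
        Repair-VirtualDisk
}
```

If every virtual disk is healthy after the optimize step, nothing reaches Repair-VirtualDisk and the second step is a no-op.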

If neither of those works, give Repair-ClusterStorageSpacesDirect (alias Repair-ClusterS2D) a try. I haven't tried this one yet, but it looks like it could help.

Update: I tried Repair-ClusterS2D. It does not appear to help with this scenario. There is limited documentation on it, but it looks like it's meant for cases such as a virtual disk getting disconnected.

Thursday, February 23, 2017

S2D Continually Refresh Job and Disk Status

In Storage Spaces Direct you can run Get-StorageJob to see the progress of rebuilds/resyncs. The following PowerShell snippet continually refreshes the status of the rebuild operation so that you know when things are back to normal.

function RefreshStorageJobStatus { while ($true) { Get-VirtualDisk | Format-Table; Write-Host "-----------"; Get-StorageJob; Start-Sleep -Seconds 1; Clear-Host } }

Enter the above in PowerShell on one line. Then enter "RefreshStorageJobStatus" to start the script. The output should look similar to the following and refresh every second:

Name   IsBackgroundTask ElapsedTime JobState  PercentComplete BytesProcessed BytesTotal
----   ---------------- ----------- --------  --------------- -------------- ----------
Repair True             00:00:13    Suspended 0               0              7784628224
Repair True             00:00:06    Suspended 0               0              7784628224



FriendlyName ResiliencySettingName OperationalStatus HealthStatus IsManualAttach Size
------------ --------------------- ----------------- ------------ -------------- ----
vd01                               OK                Healthy      True           1 TB
vd03                               Degraded          Warning      True           1 TB
vd02                               Degraded          Warning      True           1 TB
vd04                               OK                Healthy      True           1 TB


You can press Ctrl+C to stop the execution.
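If you would rather have the loop end on its own, the same two cmdlets can drive a polling wait. The following is a sketch under assumptions: the function name Wait-S2DHealthy, the 5-second poll interval, and the one-hour timeout are all hypothetical choices, and "back to normal" is assumed to mean no storage job outside the Completed state and every virtual disk reporting Healthy.

```powershell
# Hypothetical helper: poll until the cluster storage looks healthy or a timeout expires.
# Returns $true when healthy, $false if the timeout is reached first.
function Wait-S2DHealthy {
    param(
        [int]$PollSeconds    = 5,     # assumption: arbitrary poll interval
        [int]$TimeoutSeconds = 3600   # assumption: give up after one hour
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        $pendingJobs    = @(Get-StorageJob  | Where-Object { "$($_.JobState)" -ne 'Completed' })
        $unhealthyDisks = @(Get-VirtualDisk | Where-Object { "$($_.HealthStatus)" -ne 'Healthy' })

        if ($pendingJobs.Count -eq 0 -and $unhealthyDisks.Count -eq 0) {
            return $true
        }
        Start-Sleep -Seconds $PollSeconds
    }
    return $false
}
```

A job stuck in the Suspended state never completes, so in that case the function runs until the timeout and returns False, which is the cue to intervene.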

Monday, February 13, 2017

AD-less S2D cluster bootstrapping

AD-less S2D cluster bootstrapping - Domain Controller VM on Hyper-converged Storage Spaces Direct


Is it a supported scenario to run an AD domain controller in a VM on a hyper-converged S2D cluster? We're looking to deploy a 4-node hyper-converged S2D cluster at a remote site. We would like to run the domain controller for the site on the cluster so we don't need to purchase a 5th server. Will the S2D cluster be able to boot if the network links to the site are down (meaning other domain controllers are not accessible)? I know WS2012 allowed for AD-less cluster bootstrapping, but will the underlying mechanics used for storage access in S2D in WS2016 work without AD? Is this a supported scenario? AD-less S2D cluster bootstrapping?

I asked this question in the Microsoft forums and did not get a definitive answer from anyone, so I set it up and tested it, and it appears to work. I don't know whether it's officially supported, but it does work. The S2D virtual disks and volumes come up without a domain controller, at which point you can start the domain controller VM if it did not start automatically. I didn't dig into things, but I have a feeling it's using NTLM authentication and would likely fail if your domain requires Kerberos.