Good day IT folks, it's Trevor coming back at you with another real-life scenario that sparked my interest today and that I thought would be valuable to share with the broader community (I wish I had stumbled across this sooner!). Let me set the stage.
THE STAGE / SOME BACKGROUND:
So I am onsite for a customer and the scope of work is quite simple: migrate SCCM from on-premises infrastructure to Azure IaaS instances. The networking and back-end Azure components have been in place for quite a while, as this particular customer has a fiscal-year goal to move off all on-premises ESXi hosts and onto Azure. Sounds simple, right? The network is there (through a marvelous ExpressRoute), security is in place, and, surprisingly, RBAC models have been adopted appropriately per our Azure Reference Architecture recommendations. This sets us up perfectly for a standard SCCM side-by-side migration. We've done this hundreds of times in the past, yet it never ceases to amaze me that an issue I had not seen in 12+ years with this tool set would, of course, show up during this engagement. Let's dive in.
We are nearing the completion of the migration: all desired objects have been migrated, source content has been flipped, a small test group of clients has been re-assigned, and everything is flowing wonderfully. One of the last steps I perform with customers before going into further planning sessions is showing how easy it is to "Share Distribution Points" while you're in a cut-over process.
For those of you who might not be aware of this nifty feature, it lets you "Share" distribution points from your old source hierarchy with clients assigned to your new destination hierarchy. At a later date you can "Re-Assign" those distribution points, which kicks off a stored procedure that flips management of the content libraries, and of the DP roles as a whole, to your new site, all without having to re-distribute the content. It's been around for a while, and it is certainly a time and WAN saver.
Okay, ready? Let's go.
While running a discovery of the source hierarchy through Administration -> Migration -> Source Hierarchy, I had ticked the checkbox for "Enable Distribution Point Sharing" as one of my last tasks with this customer. During that discovery I was monitoring migmctrl.log and noticed that it was able to discover the majority of the data. Failures started at the point where it runs certain commands against the source hierarchy to get the NALPath (FQDN) of any site systems running the DP role, then performs a test connection through WMI to ensure the proper permissions are set to share the DPs between hierarchies.
Unbeknownst to me, this team had improperly flattened their hierarchy over a year ago and had been carrying a 'rogue' DP ever since. What do I mean by 'rogue'? Well, this particular server USED to be a DP but was no longer a DP on the source site, so why was SCCM / migration manager still finding it as a DP? This is where my investigation began:
Error in log: (migmctrl.log)
"Get the NALPath of active distribution points of current v5 site. SqlCommand = Select ServerName from DistributionPoints where DPFlags <> 1"
"Start syncing distribution points"
......"Couldn't find the specified instance SMS_SCI_SysResUse.FileType=2,NALPath='["Display=\\\\servernamehere"]'"
So what's happening? Migration manager queries the DB in a few different ways, looking for site systems that match certain conditions, and this server in particular turns up as a match.
I decided to hop on the source hierarchy, log in to the DB, and run the following query:
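If your migmctrl.log is long, it can help to pull out just the servers that trip this error. Below is a minimal Python sketch of that idea, not anything migration manager ships with. It assumes the failing lines look like the excerpt above (the "Couldn't find the specified instance" text followed by a NALPath containing Display=\\server), and the function name find_missing_instances is my own invention for illustration.

```python
import re

# Text migmctrl.log wrote for the failing instance (copied from the excerpt above).
MARKER = "Couldn't find the specified instance SMS_SCI_SysResUse"

# Assumption: the server name follows "Display=" and one or more backslashes,
# and ends at the next backslash, double quote, or closing bracket.
DISPLAY_RE = re.compile(r'Display=\\+([^\\"\]]+)')


def find_missing_instances(log_text):
    """Return server names from failing NALPath lines, in order, without duplicates."""
    servers = []
    for line in log_text.splitlines():
        if MARKER not in line:
            continue  # not one of the failing lines
        match = DISPLAY_RE.search(line)
        if match and match.group(1) not in servers:
            servers.append(match.group(1))
    return servers


# Sample text built from the log lines quoted in this post.
sample = r'''Start syncing distribution points
Couldn't find the specified instance SMS_SCI_SysResUse.FileType=2,NALPath='["Display=\\\\servernamehere"]'
'''

print(find_missing_instances(sample))  # prints ['servernamehere']
```

To run it against a real log, read the file into a string with open(...).read() and pass that in place of sample. The pattern deliberately accepts any number of backslashes after Display=, since the escaping can differ between the log and WMI.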
select * from DistributionPoints
Oddly enough, the DP in question did not return... Strange. So where is migration manager getting this from then?
I looked deeper.
I noticed in the log it was looking in WMI:
Get-WmiObject -Namespace root\sms\site_<yoursitecode> -Class SMS_SCI_SysResUse -Filter "NALPath like '%servernamehere%'"
This was found on the site, so I looked in the corresponding table in SQL:
select * from SC_SysResUse where NALPath like '%servernamehere%'
What do you know, there it is! At this point I knew this was the root issue. After confirming once again that this was indeed NOT a true DP, I simply performed a delete command (with proper Support Engineering over my shoulder):
Delete from SC_SysResUse where NALPath = 'insertservernamehere'
Once this completed, I re-ran the data discovery from the Migration node of the console, and this time it completed successfully.
So how did this happen?
Easy: an improper deletion of objects within SCCM. It appeared to me that someone had been unable to properly decommission the DP at the time and hacked the registry to delete it in an unsupported manner. Sadly, this is why we deem things 'unsupported', but as I always say, "Logs never lie, neither does WMI." That was most definitely proven true today!