FIX – SCCM Upgrade Issue – SQL Based Replication

My name is Deb and this is my first post on this blog. I will try to share my experience with SCCM, OSD, and Task Sequences more often. Recently, I faced an issue during an SCCM CB in-place upgrade from version 1802 to 1806, and in this post I will explain the issue and its resolution. Normally, the SCCM upgrade process is straightforward, but this time I ran into multiple issues before the upgrade completed.

SCCM Upgrade Issue

Our SCCM environment has a CAS server and multiple primary servers. During the SCCM CB upgrade, we found that the monitoring status for a few primary sites was not changing and kept showing as waiting.

I watched the SCCM CB upgrade status for 4-5 hours, but it stayed unchanged in the CAS console (Monitoring workspace – Upgrade status). When I checked the primary site itself, I could see that it had been upgraded successfully.

Monitoring Workspace – Replication Link Status  

However, the successful upgrade status of the primary server was not reaching the CAS server. Because of this, the SCCM upgrade did not proceed to the next stage.

Troubleshoot – SCCM Upgrade Issue

The first troubleshooting step was to go through the console monitoring status. I checked the SCCM DB replication link status on the CAS site: SQL replication had failed between the CAS and those primary sites.

The SCCM DB replication link was in a failed state, which is why the primary server's upgrade status could not reach the CAS server. I rebooted the CAS server and the SQL server to see whether that would help, but no luck. The issue was unchanged.

SCCM Upgrade Issues – SQL Replication Link Failed

I then started troubleshooting the SCCM upgrade issue with the SCCM logs. I checked the logs on the CAS server, such as sender.log and rcmctrl.log. I couldn’t find any errors in sender.log and the other logs, but I did find one interesting error message in rcmctrl.log for one of the primary servers.

Error: Failed to create drs init stored proc SMS_REPLICATION_CONFIGURATION_MONITOR 12/17/2018 1:33:06 PM 4576 (0x11E0)
Error: Exception message: [Cannot drop type ‘SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP’ because it is being referenced by object ‘spUpsertAlwaysOn_Server_MonitoringStatus’. There may be other objects that reference this type.] SMS_REPLICATION_CONFIGURATION_MONITOR 12/17/2018 1:33:06 PM 4576 (0x11E0)
Error: Could not create Drs initialization stored procedures or functions for table AlwaysOn_Server_MonitoringStatus, will retry on next cycle. SMS_REPLICATION_CONFIGURATION_MONITOR 12/17/2018 1:33:06 PM 4576 (0x11E0)
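
When rcmctrl.log is long, it can help to pull the type name and the referencing object out of this kind of message automatically. The following is a minimal Python sketch and was not part of the original troubleshooting. The function name find_blocking_reference is made up for illustration, and the pattern is based only on the message text quoted above, so other SCCM versions may word the error differently.

```python
import re
from typing import Optional, Tuple

# Accept straight quotes as well as the curly quotes that show up when
# log text is copied through a web page.
_Q = "['\u2018\u2019]"

# Pattern derived only from the error text quoted in this post (an assumption,
# not an official log format). The closing quote after the type name is
# optional so that slightly garbled copies of the line still match.
_DROP_TYPE_ERROR = re.compile(
    r"Cannot drop type\s+" + _Q + r"(?P<type>[\w.]+?)" + _Q + r"?\s*"
    r"because it is being referenced by object\s+" + _Q + r"(?P<object>[\w.]+)" + _Q
)


def find_blocking_reference(log_line: str) -> Optional[Tuple[str, str]]:
    """Return (type_name, referencing_object) if the line holds the error, else None."""
    match = _DROP_TYPE_ERROR.search(log_line)
    if match is None:
        return None
    return match.group("type"), match.group("object")


if __name__ == "__main__":
    sample = (
        "Error: Exception message: [Cannot drop type "
        "'SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP' because it is "
        "being referenced by object 'spUpsertAlwaysOn_Server_MonitoringStatus'.]"
    )
    print(find_blocking_reference(sample))
```

Running the function over every line of the log and collecting the non-None results gives a quick list of which types are blocked and by which objects.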

SCCM SQL Replication – SCCM Upgrade Issue

I did further troubleshooting of SCCM replication using the post “SQL Replication Troubleshooting Guide”.

As a next step, I tried dropping .PUB files into rcm.box on the CAS server to force replication of a particular primary site's data to start again, e.g. hardware_inventory_3-<primary site code>.pub.
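
The file naming in the example above can be sketched in Python. This is only an illustration under assumptions: the helper names pub_file_name and drop_pub_file are invented here, the rcm.box path has to be supplied by you because the post does not give it, an empty file is assumed to be enough as a trigger, and the site code is assumed to be three alphanumeric characters.

```python
from pathlib import Path


def pub_file_name(replication_group: str, site_code: str) -> str:
    """Build '<replication group>-<site code>.pub', following the example in the post."""
    # Assumption: SCCM site codes are three alphanumeric characters.
    if not (len(site_code) == 3 and site_code.isalnum()):
        raise ValueError(f"unexpected site code: {site_code!r}")
    return f"{replication_group}-{site_code}.pub"


def drop_pub_file(rcm_box: Path, replication_group: str, site_code: str) -> Path:
    """Create an empty .pub file in the given rcm.box folder and return its path."""
    if not rcm_box.is_dir():
        raise FileNotFoundError(f"rcm.box folder not found: {rcm_box}")
    target = rcm_box / pub_file_name(replication_group, site_code)
    target.touch()  # Assumption: an empty file is sufficient as a trigger.
    return target


if __name__ == "__main__":
    # "P01" is a hypothetical primary site code used only for this example.
    print(pub_file_name("hardware_inventory_3", "P01"))
```

Keeping the name building separate from the file creation makes it easy to check the generated name before anything is written to a production inbox.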

The above step didn’t resolve the SCCM upgrade issue, and the upgrade was still stuck. I raised a support case with Microsoft CSS, as I didn’t have enough time to troubleshoot further and we had to complete the SCCM upgrade within the change window.

Moreover, there was no mention of this issue in any of the TechNet forums, so it seemed to be either an SCCM bug or an issue specific to our SCCM environment.

Resolution FIX – SCCM Upgrade Issue

Microsoft analyzed the SQL database, logs, and Monitoring workspace of our SCCM infrastructure. The Microsoft engineer also found that the rcm.box folder had grown large because many files in it were not being removed.

Microsoft also tried dropping a .PUB file and got the same error in rcmctrl.log that I mentioned above. See the section below on what a .PUB file is.

After analyzing rcmctrl.log, Microsoft found that the .PUB file could not re-initiate the sync because a stored procedure in the SQL database had changed: it now referred to ‘SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP’.

SCCM Upgrade Issue – FIX – Stored Procedure Change

How to change it back to the original stored procedure name ‘SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPE’?

  • Open SQL Server Management Studio with SQL admin access
  • Search for the stored procedure that refers to “SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP” (the rcmctrl.log error above names ‘spUpsertAlwaysOn_Server_MonitoringStatus’ as the referencing object)
  • Open the stored procedure and change
    • ‘SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP’ to ‘SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPE’ (as you can see in the screen capture above)
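
To make the text change itself concrete, here is a small Python sketch that works on a copy of a procedure's definition as plain text. It does not connect to SQL Server or modify anything in the database. The helper names and the one-line definition in the example are hypothetical, since the post does not show the real procedure body.

```python
import re

OLD_TYPE = "SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP"
NEW_TYPE = "SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPE"

# The trailing \b keeps longer identifiers that merely start with OLD_TYPE
# from being rewritten by accident.
_OLD_TYPE_REF = re.compile(re.escape(OLD_TYPE) + r"\b")


def still_references_old_type(definition: str) -> bool:
    """Report whether the definition text still mentions the TYPETEMP name."""
    return _OLD_TYPE_REF.search(definition) is not None


def rewrite_type_reference(definition: str) -> str:
    """Return a copy of the definition text with TYPETEMP replaced by TYPE."""
    return _OLD_TYPE_REF.sub(NEW_TYPE, definition)


if __name__ == "__main__":
    # Hypothetical fragment; the real procedure body is not shown in the post.
    fragment = "@rows dbo.SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP READONLY"
    print(still_references_old_type(fragment))
    print(rewrite_type_reference(fragment))
```

A check like this is mainly useful for reviewing a scripted-out definition before and after the change that support makes.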

The Microsoft support engineer took a backup of the SQL DB and then changed the name in the SCCM stored procedure from ‘SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPETEMP’ to ‘SCCM_DRS_AlwaysOn_Server_MonitoringStatus_TYPE’ to fix the SCCM upgrade issue.

After the SP was modified, the replication issues were fixed, the CAS monitoring status updated to show the correct status of the primary server, and the SCCM upgrade completed successfully.

Note: I recommend not updating or modifying anything directly in the SQL DB without a recommendation from Microsoft. If you see an error in SQL, raise a ticket with Microsoft and get it fixed.

Result – SCCM Upgrade Issue Fixed

The stored procedure change brought the replication links between the primary server and the CAS back to an active state. Once the replication link was active, the SCCM 1806 upgrade issue resolved itself.

What is SCCM .PUB file?

If you have been an SCCM admin for several years, you might know .SHA files. .PUB files are similar to .SHA files. They let you manually initiate the SQL replication group sync without changing anything in the SQL DB and without running stored procedures like “spDrsSendSubscriptionInvalid”.

I don’t recommend using “spDrsSendSubscriptionInvalid” because this SP initiates re-replication between the sites and may cause a lot of network traffic and other issues.

3 COMMENTS

  1. Hi Deb – It’s very useful and helped me to resolve sccm secondary server upgrade issues … I feel if your secondary server upgrade is not completed as per the console but it got upgraded as per the logs ?? Yes then the way is to check sql replication issues between sccm primary and secondary servers .
    Thank you
