Tuesday, 3 December 2013

Can't Start the Lync Front End Service on Windows Server 2012

Sometimes Lync really surprises me: something seemingly innocuous can wreak havoc.


Recently I applied some Windows updates and restarted my Lync servers.  Two things happened on the Front End after the restart.
  1. These services were disabled:
    1. MSSQL$LYNCLOCAL - SQL Server (LYNCLOCAL)
    2. MSSQL$RTC - SQL Server (RTC)
    3. MSSQL$RTCLOCAL - SQL Server (RTCLOCAL)
  2. Some Lync services wouldn’t start, specifically the Front End service (RTCSRV).
Obviously I can’t start all the Lync services without the CMS, so I rectified that first: I set the three SQL services back to Automatic and started them all. Even then I was still unable to start the Front End service (RTCSRV). It just sat there stuck on starting.
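For anyone who prefers the shell to the Services console, re-enabling the SQL instances can be sketched in PowerShell along these lines. This is only a rough sketch, not what I actually typed: `Restore-SqlService` is a made-up helper name, and it assumes an elevated PowerShell prompt on the Front End.

```powershell
# SQL instance services named in the list above. Single quotes matter here:
# inside double quotes PowerShell would try to expand "$RTC" as a variable.
$sqlServices = 'MSSQL$LYNCLOCAL', 'MSSQL$RTC', 'MSSQL$RTCLOCAL'

function Restore-SqlService {
    # Hypothetical helper name; run from an elevated PowerShell prompt.
    param([string[]]$Name)

    foreach ($service in $Name) {
        Set-Service -Name $service -StartupType Automatic   # undo "Disabled"
        Start-Service -Name $service
    }

    # Show the result so a service that failed to start is obvious.
    Get-Service -Name $Name | Format-Table Name, Status -AutoSize
}

# Usage: Restore-SqlService -Name $sqlServices
```

Nothing runs until the function is called, so the names can be checked first.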

I tried all kinds of things including:
  • Killing the service and restarting it.
    • Open a command prompt as administrator and enter "tasklist". That gives you a list of running tasks and their process IDs (PIDs).
    • To end a task, enter "taskkill /pid 1234 /f", where 1234 is the process ID and /f forces termination. That kills the running task.
  • Restarting the server. That just put me back where I was, with the service stuck on starting.
Nothing worked.
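
The kill-and-restart attempt can also be done from PowerShell without hunting through the tasklist output for the right PID. A minimal sketch, assuming PowerShell 3.0 or later (which ships with Windows Server 2012); `Stop-StuckService` is a hypothetical helper name, not a built-in cmdlet.

```powershell
function Stop-StuckService {
    # Hypothetical helper; RTCSRV is the Front End service name from this post.
    param([string]$Name = 'RTCSRV')

    # Win32_Service exposes the PID of the process hosting the service.
    $svc = Get-CimInstance -ClassName Win32_Service -Filter "Name='$Name'"

    if ($svc -and $svc.ProcessId -gt 0) {
        # Same effect as: taskkill /pid <PID> /f
        Stop-Process -Id $svc.ProcessId -Force
        Start-Service -Name $Name
    }
    else {
        Write-Warning "No running process found for service '$Name'."
    }
}

# Usage: Stop-StuckService -Name RTCSRV
```

In my case this would have made no difference, since the service simply went back to starting.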

I checked the event logs and found events similar to this:
Event ID: 32178
Source: LS User Services
Description:
Failed to sync data for Routing group from backup store
I did some research and found some articles that pointed me to the Microsoft support page at http://support.microsoft.com/kb/2795828. It describes the error, its cause and how to fix it.
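
If you would rather pull these events from PowerShell than scroll through Event Viewer, something like the following should work. Treat it as a sketch: I am assuming the events live in a log named "Lync Server", and `Get-LyncSyncFailure` is a name I made up for illustration.

```powershell
# Filter for the event shown above. The log name is an assumption;
# adjust it if your Lync events are written somewhere else.
$syncFailureFilter = @{ LogName = 'Lync Server'; Id = 32178 }

function Get-LyncSyncFailure {
    # Hypothetical helper: returns the most recent matching events.
    param([int]$MaxEvents = 10)

    Get-WinEvent -FilterHashtable $syncFailureFilter -MaxEvents $MaxEvents |
        Select-Object TimeCreated, Id, ProviderName, Message
}

# Usage: Get-LyncSyncFailure -MaxEvents 5 | Format-List
```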

It seems that, all of a sudden, Lync didn’t like the fact that there was a non-self-signed certificate installed in the Trusted Root Certification Authorities store.

I ran this PowerShell command to find the certificate that the service didn't like.
Get-Childitem cert:\LocalMachine\root -Recurse | Where-Object {$_.Issuer -ne $_.Subject} | Format-List * | Out-File "c:\users\administrator\documents\computer_filtered.txt" 
  1. I opened MMC and added the Certificates snap-in for the Computer account on the local computer.
  2. I found the certificate listed in the text file output by the command.
  3. I dragged the certificate from Trusted Root Certification Authorities to Intermediate Certification Authorities.
  4. I tried killing the Front End service and starting it again, and it still didn’t work.
  5. So I restarted the server again. This time all the services started.
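
Steps 1 to 3 can probably be done from PowerShell as well. I did it through MMC, so take this as a sketch I have not run against a production box. Both function names are hypothetical, and it assumes the Certificate provider's `Move-Item` support and that `Cert:\LocalMachine\CA` is the Intermediate Certification Authorities store.

```powershell
function Select-MisplacedRootCert {
    # Hypothetical helper: passes through certificates whose issuer differs
    # from their subject, i.e. ones that are not self-signed. This is the
    # same test as the Where-Object filter in the command above.
    param([Parameter(ValueFromPipeline = $true)]$Certificate)
    process {
        if ($Certificate.Issuer -ne $Certificate.Subject) { $Certificate }
    }
}

function Move-RootCertToIntermediate {
    # Hypothetical helper: moves ONE certificate, chosen by thumbprint, from
    # Trusted Root (Root) to Intermediate Certification Authorities (CA).
    param([Parameter(Mandatory = $true)][string]$Thumbprint)

    Move-Item -Path "Cert:\LocalMachine\Root\$Thumbprint" `
              -Destination 'Cert:\LocalMachine\CA'
}

# Usage, from an elevated prompt:
#   Get-ChildItem Cert:\LocalMachine\Root | Select-MisplacedRootCert |
#       Format-List Subject, Issuer, Thumbprint
#   Move-RootCertToIntermediate -Thumbprint '<thumbprint from the list>'
```

Moving by thumbprint is deliberate: you review the list first and move only the certificate you have identified, rather than everything the filter matches.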
Luckily this was only a lab/demo environment rather than a live customer environment. I know you wouldn't usually apply Windows updates during the day, but what if you came back in the morning after updates had been applied and found that nobody could sign in to Lync?

Anyway, as always, I hope this helps.