Windows Server Patching Procedure
Hello All,
I have a few questions about installing Windows security updates on Windows servers (we call it patching here).
Before asking my questions, I would like to give brief details about our Windows infrastructure and the procedure we follow for patching.
- We have about 550 Windows servers (a few of them are Windows 2003 and the rest are 2008) in our domain
- We patch these servers twice a year (July and December)
- We use LANDesk Patch Manager for patch deployment
- LANDesk Provisioning Templates are useful when a server reboots in the middle of patching. Details are at http://community.landesk.com/support/docs/DOC-9485
- We follow the patching schedule below (in phases):
  - All QA/DEV/TEST servers in one phase (over a weekend)
  - All PROD servers in two weeks (over two weekends) with a gap of a week
- Once the patching is complete in each phase, we verify a few critical applications/services to make sure they are up. We ask end users to help test their applications if required.
So, my questions are below:
- Though we verify a few critical services, on the next business day after patching we get a few calls from users about their applications not working. We then identify the affected services and bring them up.
- I am wondering how this patching activity is handled in the rest of the world, so that we can improve our process.
- What procedure do companies with a few thousand Windows servers follow? Is there a better way to ensure that server health has not changed after patching (during which a server might be restarted a couple of times)?
I am really sorry for such a lengthy post, and thanks a lot for reading it. Any suggestions or input would be greatly appreciated.
Answers (1)
I'll give you a few suggestions...
With regard to your first point about receiving calls the next day - perhaps you can look at why your previous admin and user testing didn't catch these. Build these particular services into your checklist so they are checked the next time.
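One way to build such a checklist is to snapshot service states before patching and compare them afterwards. Below is a minimal Python sketch of that comparison only. The service names and the snapshot format (a dictionary mapping service name to state) are assumptions for illustration, and collecting the snapshots from each server is left to whatever tooling you already have.

```python
def find_regressions(before, after):
    """Return services that were running before patching but are not running now.

    Both arguments map a service name to its state string (for example
    "Running" or "Stopped"). A service missing from `after` is reported
    with the state "Missing".
    """
    regressions = {}
    for name, old_state in before.items():
        if old_state != "Running":
            continue  # only services that were healthy before patching matter
        new_state = after.get(name, "Missing")
        if new_state != "Running":
            regressions[name] = new_state
    return regressions


# Hypothetical snapshots for one server, taken before and after the patch window.
pre_patch = {"W3SVC": "Running", "MSSQLSERVER": "Running", "Spooler": "Stopped"}
post_patch = {"W3SVC": "Running", "MSSQLSERVER": "Stopped", "Spooler": "Stopped"}

for service, state in sorted(find_regressions(pre_patch, post_patch).items()):
    print(f"{service}: was Running, now {state}")
```

Comparing against a pre-patch baseline, rather than a fixed list of "must be running" services, means a service that was already stopped on purpose does not raise a false alarm.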
Generally speaking, however, your approach is relatively standard - with the potential exception of the frequency. Servers tend to have application or business owners, who should assist and be responsible for business checkout after changes such as this. While your desire to catch all problems before they are reported is admirable, it's not always 100% achievable.
With regard to a few thousand servers, I'd assume they have bigger staff and similar methods - just more scrutinized.
It may also be a good idea to give this book a once-over:
http://www.amazon.com/Curing-Management-Headache-Felicia-Wetter/dp/0849328543
There is also a great mailing list that deals specifically with patch management; you may be able to search through its archives or get feedback there as well -
Comments:
-
Thanks a lot for going through my long question and for the helpful reply, Sir. While the user calls after patching are relatively few (at most 2-3 calls for about 500 servers), I was actually looking for an automated way to make sure the servers are in a healthy state after patching and multiple reboots. One way is to check the running state of the services, but that may not give 100% accurate results. Unfortunately, we are looking at 99.99% accuracy here (please don't laugh at me ;-)). And thanks much for the other resources, I'll go through them now.
Thanks again for the quick help!
Regards,
Chetan - ChetanKumarT 11 years ago -
I said 99.99% success rate because we have had a few experiences where the 1% of failures had a negative impact that outweighed the 99% success rate :-( - ChetanKumarT 11 years ago
-
No problem - and 5 9's (99.999%) is always desirable. If it's as simple as services not starting, I assume you mean after a reboot? If that's happening consistently, I think you would be better served by figuring out why they are not starting, as opposed to just checking the state and starting them if needed. - drose23 11 years ago
-
Yah...we are actually planning to write a PowerShell script to verify server health by checking the services' status, errors in eventvwr...etc. I'll post more details once we have the script ready and if it's useful. Thank you! - ChetanKumarT 11 years ago
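A check along those lines might be structured as follows. This is only a sketch, written in Python rather than PowerShell so it is easy to run anywhere. It assumes the list of stopped auto-start services and the event log entries have already been collected from the server. The server name, service name, event fields, and dates are all hypothetical.

```python
from datetime import datetime


def errors_since(events, since):
    """Return events with level "Error" or "Critical" logged at or after `since`."""
    return [
        e for e in events
        if e["level"] in ("Error", "Critical") and e["time"] >= since
    ]


def summarize(server, stopped_auto_services, events, patch_start):
    """Build a one-line PASS/FAIL verdict for a single server."""
    recent_errors = errors_since(events, patch_start)
    healthy = not stopped_auto_services and not recent_errors
    status = "PASS" if healthy else "FAIL"
    return (
        f"{server}: {status} "
        f"({len(stopped_auto_services)} stopped auto-start services, "
        f"{len(recent_errors)} errors since patching)"
    )


# Hypothetical data for one server; a real script would collect this remotely.
patch_start = datetime(2013, 7, 13, 22, 0)
events = [
    {"time": datetime(2013, 7, 12, 9, 30), "level": "Error", "source": "Disk"},
    {"time": datetime(2013, 7, 13, 23, 5), "level": "Error", "source": "Service Control Manager"},
    {"time": datetime(2013, 7, 13, 23, 6), "level": "Information", "source": "EventLog"},
]

print(summarize("APP01", ["MSSQLSERVER"], events, patch_start))
```

Only errors logged after the patch window opened count toward the verdict, so older, unrelated errors do not fail a server that patched cleanly.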
-
We do our patches every 3rd week, on what we call Change Friday, and push patches using Tanium during the patch and reboot process - lefty80 11 years ago
-
Just to be clear, I was talking about Windows *Server* patching in my original question. Do you patch servers every month? - ChetanKumarT 11 years ago