How To – In Place Upgrade


Oh, this has become a recent favorite topic of mine. There has been some great logic shared with the community by the Wells Fargo dream team of Mike Terrill and Gary Blok on their websites. If you are active on Twitter, follow these 4 accounts: @AdamGrosstx, @MikeTerrill, @GWBlok and @SCCMF12TWICE. The 4 of us are frequently consulted on many things related to OSD, IPU, and pre-caching. Also, make sure you tell Mike to write his WaaS stuff already….that way I can read it and sound really smart when I am talking to customers!!! Anyway, let’s start the topic. I’m currently helping a customer out in my spare time with their Win 10 IPU scenario. What I am working to deliver within the short time constraints is a documented rinse/repeat solution for their future upgrades….until Autopilot happens! The customer had previously had their MSP do the work, but uh, let’s not go there.

This blogpost gives a high-level overview of how the entire IPU will occur. In future blogposts I will go into more detail on the specifics, and link you to where the original logic is from (if not my own). A bunch of the things I am working on delivering in this customer’s environment are done with my friend and super talented IT Architect Chad Arvay, who is legit AF on the Intune/Autopilot/co-management side of things. So look forward to him and me bringing some great content to the community on that in the near future.

At this customer’s site I am running with this approach.

 

Pre-Ground Work: Identify the current OS level of the environment and the OS we are going to “Upgrade” or “Migrate” to. In this customer’s case we are moving from 1511 to 1703, and then potentially moving to 1806 afterwards.

How to Select potentially Ready System: In the customer’s environment we deploy the compat scan package against all 1511 systems as required. The people over at SystemCenterDudes have a great how-to document for creating this package. This will be configured to run from the DP. Users should not experience any problems while this scan is running. Once the systems run the scan and report back data, we can better visualize the success/fail rate of systems to be upgraded in the migration candidate dashboard, which we will show shortly. I have deciphered a few more of the compat scan results than what I see out there in the community and will post the SQL for that in a future post as well.

From the SCCM admin’s perspective this action takes place from the 1511 collection, and if successful the system will move forward to the compat scan pass collection like below. I have all my collections numbered so you can easily see the order in which things happen and where systems will go. We will get into the collection design/query in a future blogpost so you can make the magic happen too!

SCHEDULING SYSTEMS  -> DEPLOYING SYSTEMS

  1. The PM/Scheduling team will run the dashboard and select systems that are approved for deployment. Some environments will do this based on location/team/availability; just use whatever suits your environment best.
    • Identify Systems that are ready/not ready for upgrade based on criteria provided in the “Windows Migration – Candidates” report.
      • Use criteria provided in Migration Candidate Dashboard
        • Approved model for upgrade
        • Enough Space in Nomad Cache (15%)
        • Enough space on the system (30GB)
        • Upgrade Readiness compat scan results
        • Client Health
        • HW/WMI returned in 14 days
        • Reboot pending
        • Client Health evaluation
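
To make the criteria above concrete, here is a minimal Python sketch of how a candidate row could be checked. It is not the SQL behind the “Windows Migration – Candidates” report: the `Candidate` fields and the `blockers` helper are hypothetical names, and reading “15%” as "at least 15% of the Nomad cache free" is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Thresholds come from the criteria list; how they are measured is an assumption.
MIN_DISK_FREE_GB = 30
MIN_NOMAD_CACHE_FREE_PCT = 15
MAX_INVENTORY_AGE = timedelta(days=14)


@dataclass
class Candidate:
    """Hypothetical row from the candidates report (field names are made up)."""
    name: str
    approved_model: bool
    nomad_cache_free_pct: float
    disk_free_gb: float
    compat_scan_passed: bool
    client_healthy: bool
    last_hw_inventory: date
    reboot_pending: bool


def blockers(c: Candidate, today: date) -> list:
    """Return the reasons a system is not ready; an empty list means ready."""
    reasons = []
    if not c.approved_model:
        reasons.append("model not approved")
    if c.nomad_cache_free_pct < MIN_NOMAD_CACHE_FREE_PCT:
        reasons.append("not enough Nomad cache space")
    if c.disk_free_gb < MIN_DISK_FREE_GB:
        reasons.append("not enough disk space")
    if not c.compat_scan_passed:
        reasons.append("compat scan not passed")
    if not c.client_healthy:
        reasons.append("client unhealthy")
    if today - c.last_hw_inventory > MAX_INVENTORY_AGE:
        reasons.append("hardware inventory older than 14 days")
    if c.reboot_pending:
        reasons.append("reboot pending")
    return reasons


today = date(2018, 6, 1)
ready = Candidate("PC-001", True, 40.0, 55.0, True, True, date(2018, 5, 25), False)
stale = Candidate("PC-002", True, 40.0, 12.0, True, True, date(2018, 4, 1), True)
print(blockers(ready, today))  # []
print(blockers(stale, today))
```

Returning the list of reasons instead of a plain yes/no makes it easier for a scheduling team to see why a system was skipped.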

 

  1. The Deployment Team will receive the list of systems for deployment and add it to the “3 – Migration Tracking List” collection. This collection is what we will use after the TS is finished as part of our “Windows Migration – Candidates” report. Anything in this collection will also populate the “3.x – PreFlight Tier 1” collection. This is the collection where the members start to get updated client settings. The client setting here forces check-ins to be more frequent, so policy gets to the systems and I get data quicker! The customer should not notice much change (they can still install apps, etc). This is also the collection where I have compliance items and have some scripts run against the system. This is the ground work for making sure enough space is cleared out on the system to start the TS. I have 3 compliance items against this collection. They will clear the CCM cache of old content, clear out space on the system (temp locations etc), and change the Nomad polling settings. The script I have deployed here will clear the Nomad cache of all content older than 30 days…please note that when I clear content from the Nomad cache this also removes the same package IDs from the CCM cache, just in case you did not know…now you do!
  1. Systems will also populate the “4 – Begin Nomad Cache Content” collection, and also populate the “4.x xxxxx” model specific collections to get additional driver packages. Once all the content is “Precached” the system will be automatically added to another collection. In total the systems will require 19 specific package IDs in order to start the TS later in this overview.
    1. NOTE: if a system does not show up here it is because of the “Safety” I have built in place with the compat pass collection. This means that the system either did not run, or it failed, the compat scan. This should not happen in production if the criteria are provided
      NOTE: I will go into more detail on the Nomad Jobs, Client Setting to track Nomad Jobs, Compliance Item for Nomad Polling, SQL to track that logic, Collection (this is custom SQL that you don’t have a “dropdown” to choose from for the collections)
  1. Once the systems finish getting all 19 required pieces of content in this environment, then, and only then, will the system move to a “5.x model specific” collection. These different approved models that are fully cached make up the “5 – Win 10 1703 UPGRADE DEPLOYMENT” collection. It is in this collection where the shortcut icon is placed to start the TS, a scheduling tool is called, and the TS is available with a deadline date. NOTE: I tried to make all models work within 1 collection, but I just couldn’t make the query update efficiently, so that is why this is currently broken out this way.
  2.  Once systems are in this collection the users will have the desktop shortcut icon to do the upgrade.
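
The "all 19 pieces of content, then and only then" gate described above boils down to a set comparison. Here is a minimal Python sketch of that idea; the package IDs are hypothetical placeholders, and in the real environment the membership is driven by a collection query, not by a script like this.

```python
# Hypothetical package IDs; the real environment has its own 19 IDs.
REQUIRED_PACKAGE_IDS = {f"PKG{n:05d}" for n in range(1, 20)}


def missing_content(cached_ids):
    """Return the required package IDs that are not cached yet, sorted."""
    return sorted(REQUIRED_PACKAGE_IDS - set(cached_ids))


def is_fully_cached(cached_ids):
    """A system may move on only when nothing required is missing."""
    return not missing_content(cached_ids)


# Example: a system that has cached everything except the last package.
partly_cached = REQUIRED_PACKAGE_IDS - {"PKG00019"}
print(is_fully_cached(partly_cached))  # False
print(missing_content(partly_cached))  # ['PKG00019']
```

Extra content in the cache does not matter here; only the required IDs are compared.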

I can do a future post on how to do shortcut icons that kick off a TS, and how to clean that stuff up, when I get a minute. I do not necessarily like the shortcut icon, but it is convenient for users to be able to start the TS by clicking on it. Just be aware that users can easily miss the shortcut, or accidentally start the TS this way. ALSO, when systems are in this collection there is another deployment which will call the “schedule tool”. This will display on top of every item on the customer’s screen, prompting them to perform the upgrade or to schedule the upgrade. You can customize your own messages and configurations. This tool allows the user to schedule down to the second when they want to upgrade. Customers can also use this to communicate a location where users can read more on the process, or go ahead and start the upgrade now. The guys over at OneVinn came up with this tool here. This tool is great because of a “Nag factor” that displays the message on a scheduled basis above everything on your desktop. This tool should nag enough people to start the upgrade sooner rather than later!

  1. A few notes about the TS…the TS will first verify there is at least 30 GB of free space available, and if there is not enough space the TS will work to clean up more space. The TS will uninstall/upgrade any non-compatible security app and then perform the upgrade. After the upgrade it will install the latest driver package and a few key apps. After this is complete the TS will clean up space on the system before completing. This is all done in a high performance setting to speed up the process. I have a bunch of things that I was doing in this TS that I saw Gary do wayyyy more efficiently, so I swapped some of my logic out to use Gary’s method. There is also the Console Extension from OneVinn for the collection move stuff, so make sure you check out their stuff on TechNet too! I won’t go much in depth in future posts on how my TS is designed, but this is a quick screenshot for the curious

  2. NOTE: if you are curious about how to track OSD via SQL you can read about it in my blogpost here
  1. Once a system runs the TS, the logic will tattoo the system with more data specific to the TS so it can be more accurately tracked and aid in future migration plans. The data we will now get back includes how many times the system attempted IPU, how long the IPU took, etc. That registry location will be HKLM\Software\WaaS\XXXX. I won’t provide a screenshot here because it’s way more limited than the way Gary uses it for his environment, since we have different approaches, so just check out his website for this. NOTE: information that is captured within this registry location will be added to a “Migration Status Dashboard”. This is currently being drafted, but once it’s done I will update this blogpost and possibly release the core SQL code for the community’s use. This will require you to extend client settings, so just a heads up on that.
  2. At completion of the TS, systems will be moved to a “Completed” or “Failure b/c” collection depending on exit code. The majority of failures for now will be recommended for bare metal deployment. In most remote cases where all the content is pre-cached we expect the upgrade process to take 45 minutes or less. I believe it is best for us to communicate to the end customer that the TS will take an estimated 2 hours to complete.
  3. When the system finishes the TS all the data is reported back, and will be available in the “Migration Tracking Dashboard”. Among other things this dashboard will have a chart tracking info for systems upgraded in the last 2 weeks, a pie chart to identify systems that went bare metal vs upgraded to the latest approved OS, etc. Some things I will have on the dashboard look like this, but it will include more logic such as how long each system took on avg, how many attempts were made by each system, etc. I’ll add the final screenshots later when the SQL/SSRS is ready, but for now here is a sneak peek….this is an amazing report, trust me, I know plenty of reports but this one is AMAZING! Obviously I have redacted stuff from some of these screenshots but you get the idea.

In short the Upgrade Sequence – Pass Scenario will look like this:

I have already started writing the follow up blogposts for how to do the collections, SQL logic, compliance items etc. So keep checking back for how you can get this going in your environment.

Reduce OSD Time



1. Set High Performance Power Plan

  • In multiple environments that I have been in, this has proven to save 20% – 50% on certain steps, such as download/apply the OS. Also note that by default ConfigMgr runs with the power setting set to balanced while in WinPE, so by making sure we use all the resources we can, we see a major win from this piece. I also set this to prevent systems from going to sleep during the TS.

2. Set SMS Host Agent start-up options

  • By default the SMS Host Agent (CcmExec) service is set to a delayed start. So every time you restart the system during the OSD process and you see the “initializing System Center Configuration client” message, you are just costing time. In my environment this message would usually last 2 – 5 minutes each restart. Once I implemented this we noticed the ConfigMgr client would take < 15 seconds to initialize, which makes this specific task sequence action roughly 8 to 20 times faster.
  • Frank Rojas, who is one of Microsoft’s CSS lead OSD engineers, has let me know that this has POTENTIAL for failures in a TS. So if you use this step, please understand it is not supported by Microsoft. I have not personally seen any problems with this, but I do understand an OSD failure is possible as a result.

3. Set SMSTSRebootDelay

  • This changes the reboot delay from the default of 30 seconds to 0 seconds, so restarts occur immediately
4. Set manual Restart times (for displayed messages)
  • By default when you add a restart computer step into the TS the time is set to 60 seconds. Simply by un-checking the “Notify the User before restarting” box you now force the restart immediately. Should you need to actually display a message you can just reduce the amount of time this box is displayed.
5. Eliminate unnecessary restarts / steps
  • I have seen several customer environments that had 8 or more restart steps in their TS. Whenever you have a restart in a TS you can assume usually 2 – 5 minutes are added to the total TS time. I would recommend investigating the status messages for exit codes indicating a restart is necessary before you start modifying a customer’s task sequence. Many of my peers usually have only 2 or 3 restarts in their TS as they have streamlined their deployments.
  • MVP Mick Pletcher has a great post on how to handle restarts in your TS. Basically, add a condition statement to check if the system needs a restart. You can read about it and implement it here
6. Patch the WIM
  • This will of course help you out on the install software update steps in your TS. The more patches that are in your WIM, the less there is to process later. I tend to bake Office and patches into the WIM. I’m not a fan of having really thick images loaded with several 3rd party apps that can be installed during the TS.
7. Verify boundaries are efficient
  • This should be another no-brainer idea, but you would be surprised. I’ve seen customer environments where a machine on the East Coast of the USA would pull down images from the Middle East….like there is literally an entire ocean between the DP and the workstation.
8. Prevent OOBE From connecting to Windows Update
  • I’ve seen in the SMSTS.LOG before that a system I was imaging was trying to reach out to Windows Update. In some places you are confined to imaging from certain VLANs only, and you can block this communication at the firewall. In my TS I modify the registry to prevent this action from occurring. Unfortunately this was something changed in my TS a long time ago so I did not grab a screenshot. I can edit this post when I get around to making the next revision of changes in the updated WIM. (Thanks Todd)
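
A quick way to sanity check the figures quoted in the list above is to run the arithmetic. The Python sketch below uses only numbers from the list (2 – 5 minutes per client initialization before, about 15 seconds after, and 2 – 5 minutes per restart); the function names are just illustrative helpers.

```python
def speedup(before_s, after_s):
    """How many times faster the step is after the change."""
    return before_s / after_s


def reduction_pct(before_s, after_s):
    """Percentage of the original time that was removed."""
    return 100 * (before_s - after_s) / before_s


AFTER_S = 15  # upper bound quoted for client initialization after the change
for before_s in (2 * 60, 5 * 60):
    print(before_s, speedup(before_s, AFTER_S), reduction_pct(before_s, AFTER_S))
# 120 8.0 87.5
# 300 20.0 95.0


def restart_overhead_min(restarts, per_restart_min=(2, 5)):
    """Low/high estimate of minutes added to the TS by its restart steps."""
    low, high = per_restart_min
    return restarts * low, restarts * high


print(restart_overhead_min(8))  # (16, 40)
print(restart_overhead_min(3))  # (6, 15)
```

So going from 8 restarts to 3 would be expected to save somewhere between 10 and 25 minutes on its own, before any of the other changes.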
BEFORE MODIFICATIONS

In my test the OSD process would take around 100 – 110 minutes…that’s nearly two hours, and wayyyy too long.

 

I like to use Thomas Larsen’s OSD Dashboard when I keep track of task sequence information like this. Please check out his blog here. I believe this dashboard is a must-have for every SCCM admin.
 
AFTER MODIFICATIONS

The screenshot below shows the OSD times after the changes: the process has been reduced to 36 minutes. I am positive I can actually get this to 32 minutes or less, but for now I will leave good enough alone. This test shows that we have cut the OSD time to roughly a third of the original. I’ve been in other environments where the imaging process took 24+ hours, but that is mainly down to boundary related issues.
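
For anyone who wants to plug in their own before/after numbers, here is a small Python sketch of the comparison, using the 100 – 110 minute and 36 minute figures from this test. The helper name is hypothetical.

```python
def remaining_share(before_min, after_min):
    """Fraction of the original OSD time that is still needed."""
    return after_min / before_min


AFTER_MIN = 36
for before_min in (100, 110):
    share = remaining_share(before_min, AFTER_MIN)
    saved = before_min - AFTER_MIN
    print(f"{before_min} min -> {AFTER_MIN} min: "
          f"{share:.0%} of the original, {saved} min saved")
# 100 min -> 36 min: 36% of the original, 64 min saved
# 110 min -> 36 min: 33% of the original, 74 min saved
```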

 

TS MODIFICATIONS

 

Power settings: I kind of have this littered everywhere in my TS before major time consuming steps like apply OS, install patches, install apps, etc.

Name: Set High Performance Power Scheme
Command Line: PowerCfg.exe /s 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c
 
SMS Host Agent Start-Up Properties: This only needs to go in 1 place right after you install the SCCM Client. 


Name: Set SMS Host agent to start immediately
Command Line: cmd /c sc config "CcmExec" start= auto







Dynamic Variables: I have all my variables set here. These are for resiliency, and speed.

 



Prevent OOBE from Downloading Drivers:

Name: Mount the Offline HKLM File
Command Line: reg load HKLM\Offline C:\Windows\System32\Config\Software

Name: Set DODownloadMode
Command Line: REG ADD HKLM\Offline\SOFTWARE\Policies\Microsoft\Windows\DeliveryOptimization /v DODownloadMode /t REG_DWORD /d 100

Name: Set DoNotConnectToWindowsUpdate
Command Line: REG ADD HKLM\Offline\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate /v DoNotConnectToWindowsUpdateInternetLocations /t REG_DWORD /d 1

Name: Unmount the Offline HKLM File
Command Line: reg unload HKLM\Offline





End of TS to change back to default (if wanted)
 

 

Name: Set SMS Host agent to start delayed
Command Line: cmd /c sc config "CcmExec" start= delayed-auto

Name: Set Balanced Power Scheme
Command Line: PowerCfg.exe /s 381b4222-f694-41f0-9685-ff5bb260df2e




 

NOTE: 
This customer has unnecessary restarts in their OSD design, and it was not very streamlined. Things in this version of the TS can easily be baked into the WIM on the next image refresh with the build/capture TS I provided for the customer. There are slow steps in this TS, like installing Visual J+ or .NET Framework 3.5, that would reduce the OSD time if baked into the image.

Other things to reduce OSD Time: 
I still have unnecessary install software update/restart steps in the TS as a sort of catch-all. These systems come out fully patched, but the steps are left enabled in case I do not get time to capture and test a new WIM one month.

Prevent
I have seen in the SMSTS.LOG where systems would try to reach out to Windows Update for patches instead of staying within my environment. This can be changed via a registry modification…I have this in place in my environment but did not capture screenshots when I saw it in the log files.

SSRS – Client Health Dashboard

Client Health Dashboard
In previous posts I mentioned that I would soon be releasing a series of SSRS reports based on troubleshooting. The first in the series focuses on a Client Health Overview. This dashboard will later include drill down functionality to multiple other reports as soon as I can finish making and validating them. This dashboard is something that I put together for my current customer to get a brief overview of client health. There will be a few more items added to the home page at a later time, which will be focused on counts of vulnerable/unpatched systems.
Currently we can see in this report active clients, MP information, client version counts, and OS version counts. There are even a few top level items you should keep an eye on, like duplicate systems, MAC addresses, systems running out of space, or why clients failed to install/re-evaluate. These will all be further expanded on when the full client health troubleshooting series wraps up over the next few weeks.

Link to the Report: https://gallery.technet.microsoft.com/SSRS-Client-Health-6bfb794f

Some code used from: Greg Ramsey and Eswar Koneti

Future Dashboards w/ drilldowns coming soon.

– Client Health (will be finished soon)

  • This will also include drill downs into several other reports

– Software Deployment (For Packages / For Apps)

  • This will also include how to troubleshoot just like this other report
– Infrastructure health (Primary/DP/SQL/DP etc health focused)
  • This will also include how to troubleshoot just like this other report
  • This will also include drill downs into several other reports
– Windows Migration Summary

 

– Collection evaluation