🚀 Executive Summary
TL;DR: MSPs often struggle with rigid operating frameworks that fail to address both predictable service management and chaotic project delivery. The solution involves adopting pragmatic, hybrid systems that blend essential ITIL principles for stability with DevOps practices for agility, or evolving to dedicated product teams for deeper client partnerships.
🎯 Key Takeaways
- Implement a simple Kanban board with Work In Progress (WIP) limits for immediate visibility into bottlenecks and to separate daily service desk churn from project work.
- Adopt a hybrid “ITIL-Lite + DevOps” model: take Incident Management, Change Management (Lite), and a Service Catalog from ITIL, and take Infrastructure as Code (IaC), CI/CD, and tight feedback loops from DevOps.
- Utilize a “Change as Code” approach for major changes, where the pull request in a Git repository serves as the change request and peer review acts as the Change Advisory Board (CAB) approval, ensuring repeatability and reversibility.
Stop drowning in tickets and chasing chaos. Discover which MSP operating frameworks actually work by ditching rigid dogma for pragmatic, real-world systems that blend stability with agility.
Frameworks That Don’t Suck: An MSP Engineer’s Guide to Getting Work Done
I still get a cold sweat thinking about “The Great Billing Outage of ’19.” It was 3 AM, and I was staring at a terminal connected to prod-billing-db-01, which was very, very unhappy. The root cause? A “minor patch” that was approved via an email chain with the subject line “Re: Fwd: quick question.” There was no formal change request, no rollback plan, and no one on the thread even knew what the patch was really for. Our operating framework at the time was a chaotic mix of gut feelings, overflowing inboxes, and sheer heroism. We fixed it, of course—with enough caffeine and adrenaline, you can fix anything—but I promised myself that day: never again. We needed a system that wasn’t just a three-ring binder of ITIL theory nobody reads.
Why Most Frameworks Feel Like Corporate Handcuffs
Let’s be honest. When someone says “ITIL” or “Agile Transformation,” most engineers hear “more meetings and useless paperwork.” The core problem is that MSPs live in two worlds simultaneously. On one hand, you have the steady, predictable world of Service Management: user lockouts, new PC setups, and firewall rule changes. This world needs structure, SLAs, and repeatability. That’s ITIL’s home turf.
On the other hand, you have the chaotic, high-stakes world of Project Delivery and Cloud Architecture: migrating a client’s entire infrastructure to Azure, deploying a new CI/CD pipeline, or responding to a critical security vulnerability. This world needs speed, flexibility, and tight feedback loops. This is where DevOps and Agile shine.
Trying to force one rigid framework on both worlds is like trying to use a socket wrench to hammer a nail. You’ll just end up with a broken process and a lot of frustrated engineers. The key isn’t to find the one “perfect” framework, but to build a hybrid system that works for your team and your clients.
Solution 1: The Quick Fix – The ‘Get-It-Done’ Kanban Board
If you’re drowning right now and need immediate relief, forget the certifications and consultants. Stand up a simple Kanban board. You can use Jira, Azure DevOps Boards, Trello, or even a physical whiteboard. The goal is pure, unadulterated visibility.
Your columns can be as simple as this:
- Backlog: Every single request, idea, and complaint lives here. It’s the chaotic inbox.
- Ready for Work: The team has triaged this, it’s understood, and it’s prioritized.
- In Progress: What I’m actively working on. Crucially, limit how many items can be in this column per person (a “WIP Limit”). If my limit is 2, I can’t start a third thing. This prevents context-switching burnout.
- Blocked / Needs Info: I’m stuck waiting on the client, a vendor, or another tech.
- Peer Review / QA: Another set of eyes checks my work before it goes live.
- Done: The sweet release.
This isn’t a permanent solution for everything, but it immediately exposes bottlenecks. Is the “Blocked” column always full? Your communication process is broken. Is nothing moving out of “In Progress”? Your team is overloaded. It’s a diagnostic tool and a workflow system in one, and you can implement it this afternoon.
Pro Tip: Don’t mix your “keep the lights on” tickets with your major project tasks on the same board. Create two boards: one for the daily service desk churn (Service Board) and one for value-add project work (Project Board). This keeps the priorities clear.
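To make the WIP limit concrete, here is a minimal Python sketch of the board described above. It is an illustration only, not how Jira, Azure DevOps Boards, or Trello enforce limits. The class and method names (`Board`, `Card`, `WipLimitExceeded`, `move`, `counts`) are hypothetical; the column names and the per-person limit of 2 come from the text.

```python
from dataclasses import dataclass, field

# Column names taken from the list above.
COLUMNS = (
    "Backlog",
    "Ready for Work",
    "In Progress",
    "Blocked / Needs Info",
    "Peer Review / QA",
    "Done",
)


class WipLimitExceeded(Exception):
    """Raised when a move would put someone over their In Progress limit."""


@dataclass
class Card:
    title: str
    owner: str
    column: str = "Backlog"


@dataclass
class Board:
    wip_limit: int = 2  # per-person limit; 2 matches the example in the text
    cards: list = field(default_factory=list)

    def add(self, title: str, owner: str) -> Card:
        """Every new request starts in the Backlog."""
        card = Card(title, owner)
        self.cards.append(card)
        return card

    def in_progress(self, owner: str) -> int:
        """Count the cards this person currently has in In Progress."""
        return sum(
            1 for c in self.cards
            if c.owner == owner and c.column == "In Progress"
        )

    def move(self, card: Card, column: str) -> None:
        """Move a card, refusing to exceed the owner's WIP limit."""
        if column not in COLUMNS:
            raise ValueError(f"unknown column: {column}")
        entering = column == "In Progress" and card.column != "In Progress"
        if entering and self.in_progress(card.owner) >= self.wip_limit:
            raise WipLimitExceeded(
                f"{card.owner} already has {self.wip_limit} items in progress"
            )
        card.column = column

    def counts(self) -> dict:
        """Cards per column: a crude bottleneck report."""
        totals = {name: 0 for name in COLUMNS}
        for card in self.cards:
            totals[card.column] += 1
        return totals
```

Calling `counts()` on a schedule gives a rough version of the diagnostics described above: a “Blocked / Needs Info” number that keeps growing points at communication, and an “In Progress” number that never drops points at overload. For the two-board setup, you would simply create one `Board` for service work and one for projects.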
Solution 2: The Permanent Fix – The Hybrid “ITIL-Lite + DevOps” Model
Once you’re no longer drowning, it’s time to build a real boat. This is where most successful MSPs I’ve worked with land. You steal the best, most practical ideas from both ITIL and DevOps and discard the bureaucratic nonsense.
What to Steal from ITIL:
- Incident Management: A clear, no-BS process for when things break. Who gets called, how is it triaged (P1 vs. P4), and what’s the communication plan? This is non-negotiable.
- Change Management (Lite): Not the 10-page form and a committee meeting for every change. Instead, define change “types.” A password reset isn’t the same as a database schema migration. For major changes, we use a “Change as Code” model. The pull request is the change request.
- Service Catalog: A simple menu of repeatable requests (e.g., “New User Account,” “Request VPN Access”). This standardizes intake and allows for automation.
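As a rough illustration of the “P1 vs. P4” triage mentioned under Incident Management, the sketch below derives a priority from impact and urgency, which is a common ITIL-style approach. The article does not say how priorities are actually assigned, so the scoring rule, the `triage` function, and every value in `RESPONSE_PLAN` are assumptions to replace with your own SLAs.

```python
from enum import IntEnum


class Level(IntEnum):
    HIGH = 1
    MEDIUM = 2
    LOW = 3


def triage(impact: Level, urgency: Level) -> str:
    """Map impact and urgency to a priority from P1 (worst) to P4."""
    score = int(impact) + int(urgency)  # 2 (both high) up to 6 (both low)
    return f"P{min(score - 1, 4)}"      # anything milder than P4 stays P4


# Hypothetical communication plan per priority; the numbers are placeholders.
RESPONSE_PLAN = {
    "P1": {"page_on_call": True, "client_update_minutes": 30},
    "P2": {"page_on_call": True, "client_update_minutes": 120},
    "P3": {"page_on_call": False, "client_update_minutes": 480},
    "P4": {"page_on_call": False, "client_update_minutes": None},
}
```

For example, `triage(Level.HIGH, Level.HIGH)` yields `"P1"`, and `RESPONSE_PLAN["P1"]` then answers “who gets called” and “how often do we update the client” without anyone having to decide at 3 AM.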
What to Steal from DevOps:
- Infrastructure as Code (IaC): We manage our clients’ Azure/AWS environments with Terraform, or with Bicep where the environment is Azure-only. The code lives in Git. This makes changes repeatable, reviewable, and reversible.
- CI/CD for Everything: We don’t just use pipelines for client applications. We have pipelines that deploy firewall changes, update server configurations via Ansible, and provision new client tenants.
- Tight Feedback Loops: We hold short, 15-minute daily stand-ups for our project teams. We do weekly check-ins with our main client contacts. The goal is to surface problems while they are still small.
Here’s what a lightweight, “Change as Code” request might look like in a Git repo. The peer review on the Pull Request serves as the Change Advisory Board (CAB) approval.
```yaml
# File: changes/PROD-2024-034.yml
# Ticket: JIRA-TICKET-1234
# Service: ClientA-WebApp
# Engineer: darian.vance@techresolve.com
change_type: "Standard"
summary: "Increase webapp memory from 4GB to 8GB to resolve performance alerts."
risk_assessment: "Low. Service plan allows for hot-scaling. Brief restart of instances expected."
implementation_plan:
  - "Merge this PR to the 'main' branch."
  - "Azure DevOps Pipeline 'ClientA-WebApp-Deploy' will trigger."
  - "Pipeline applies Terraform changes to update App Service Plan SKU."
rollback_plan:
  - "Revert this Pull Request."
  - "Re-run the pipeline to apply the previous state from Terraform."
```
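A change file like this is only useful if every field is actually filled in, so one option is to have the pipeline reject incomplete requests before a reviewer ever sees them. The Python sketch below shows one way that check might look. It assumes the YAML has already been parsed into a dictionary by whatever YAML library you use. The function name `validate_change` is hypothetical, and the allowed set of change types is an assumption: “Standard” comes from the example above, while “Normal” and “Emergency” are the other ITIL change types.

```python
# Fields every change file must carry (taken from the example above).
REQUIRED_FIELDS = (
    "change_type",
    "summary",
    "risk_assessment",
    "implementation_plan",
    "rollback_plan",
)

# Assumed set of allowed types; adjust to your own change policy.
CHANGE_TYPES = {"Standard", "Normal", "Emergency"}


def validate_change(change: dict) -> list:
    """Return a list of problems; an empty list means the request is complete."""
    problems = []

    for name in REQUIRED_FIELDS:
        if not change.get(name):
            problems.append(f"missing or empty field: {name}")

    change_type = change.get("change_type")
    if change_type and change_type not in CHANGE_TYPES:
        problems.append(f"unknown change_type: {change_type!r}")

    # Plans must be lists of non-empty steps, not a single free-text blob.
    for name in ("implementation_plan", "rollback_plan"):
        steps = change.get(name)
        if steps and not (
            isinstance(steps, list)
            and all(isinstance(s, str) and s.strip() for s in steps)
        ):
            problems.append(f"{name} must be a list of non-empty steps")

    return problems
```

Run as a pipeline step on every pull request, a non-empty result would fail the build, which means a change with no rollback plan never reaches peer review.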
Solution 3: The ‘Nuclear’ Option – Ditching Projects for Product Teams
This is the deep end, and it’s not for everyone. But for MSPs focused on high-value, long-term partnerships, it’s a game-changer. You stop thinking about clients in terms of “projects” and “tickets” and start treating their environment as a “product” that your team owns and improves over time.
Instead of a random pool of engineers grabbing tickets, you create small, dedicated “Client Pods” or “Value Stream Teams.”
- A pod might consist of a Lead Engineer (the Product Owner), a couple of SysAdmins, and maybe a security specialist.
- This pod is responsible for the entire lifecycle of a handful of clients.
- They manage the backlog, which includes break/fix tickets, security hardening, tech debt reduction, and new feature requests from the client.
- They are empowered to make decisions and are measured on client health, stability, and satisfaction—not just how many tickets they closed.
This is a massive cultural shift. It requires trust, autonomy, and a business model that supports proactive improvement, not just reactive break/fix billing. But when it works, you move from being a simple service provider to a true technology partner.
Which Path Is Right For You?
There’s no magic bullet. Your choice depends entirely on your team’s maturity, your client base, and your tolerance for change.
| Approach | Implementation Speed | Best For… | Biggest Risk |
| --- | --- | --- | --- |
| 1. Kanban Board | Hours to Days | Teams drowning in chaos that need immediate visibility. | It’s a band-aid, not a cure. Doesn’t fix underlying process issues. |
| 2. ITIL-Lite + DevOps | Weeks to Months | Most growth-oriented MSPs managing both service desk and complex projects. | Requires engineer buy-in and discipline to not over-engineer the process. |
| 3. Product Teams | Quarters to Years | Mature MSPs with long-term, high-trust client relationships. | Massive cultural and business model shift. High risk, high reward. |
The worst thing you can do is nothing. Pick a path, start small, and get feedback from your team. A framework should serve the engineers, not the other way around. Now go build something that works.
🤖 Frequently Asked Questions
❓ What are the common pitfalls of traditional MSP operating frameworks?
Traditional frameworks often fail because they are too rigid, attempting to apply a single methodology (like full ITIL or pure Agile) to both predictable service management and chaotic project delivery, leading to engineer frustration and inefficiency.
❓ How does the hybrid ITIL-Lite + DevOps model compare to implementing full ITIL or pure Agile/DevOps?
The hybrid model is more pragmatic. It takes the practical elements of ITIL (e.g., Incident Management, Change Management Lite, Service Catalog) for stability and the practical elements of DevOps (e.g., IaC, CI/CD, feedback loops) for agility. That avoids both the bureaucracy of full ITIL and the lack of structure for routine tasks that pure Agile/DevOps can bring.
❓ What is a common implementation pitfall when adopting a Kanban board for MSPs, and how can it be avoided?
A common pitfall is mixing ‘keep the lights on’ tickets with major project tasks on the same Kanban board, leading to unclear priorities. This can be avoided by creating separate boards: one for daily service desk churn (Service Board) and another for value-add project work (Project Board).