
A recent MIT NANDA study found that 95% of enterprise AI projects fail to deliver tangible ROI. AI projects are software projects, and this made me think of historical software engineering project success rates.
More than half of software engineering projects fail in some way!
And this has been true for so long that we have grown numb to it. The cost of defects, production incidents, and code that is never used or even deployed to production was a staggering $2.41 trillion in 2022, nearly 10% of the United States GDP!
The landmark Standish Group CHAOS reports show success rates improving from 16.2% in 1994 to 31% in 2020. That's better than what the MIT NANDA study shows, but software engineering was a more mature discipline in 1994 than AI engineering is in 2025. The success rates of AI projects will improve, but as long as AI projects remain a subclass of software projects, they will be limited by the same issues keeping the software project success rates low.
There are many reasons why a majority of software engineering projects fail, but I wanted to illustrate a major one that I have observed.
Organizations overvalue uniformity in software engineering. It goes by many names: "process standardization", "IT consolidation", "technology rationalization", and so on. These efforts have obvious value and are often irresistible to executives trying to reduce costs and improve quality. The downsides of the one-size-fits-all approach are routinely underestimated, however, and many software project failures can be attributed to it, at least in part. (It's never just one thing.)
Adopting a standard process or architecture is much easier than picking a more optimal one for each project, and even that is easier than continuously optimizing throughout the project. Convincing people to do hard things is not easy (eat healthy and exercise, for example), but I wanted to share a couple of tables as an attempt to illustrate the cost of uniformity, and the value of tailoring approaches, processes, architectures, and tech stacks to each (and throughout each) software engineering project. They are not exhaustive or organized, just laundry lists.
The point is that everything in the first table has both pros and cons, and choosing anything before knowing the context - where the project is on the multi-dimensional spectrum illustrated by the second table - is likely to be wrong, or at least not optimal.

| One choice | | Another choice |
| --- | :---: | --- |
| waterfall | vs. | agile |
| Scrum | vs. | Kanban |
| 2-week sprints | vs. | 6-week cycles |
| Test Driven Development (TDD) | vs. | Test-After Development (or even "Production is the most productive testing environment.") |
| Continuous Integration/Continuous Deployment (CI/CD) | vs. | Change Approval Board (CAB) |
| pair programming | vs. | code reviews |
| vibe coding vs. spec-driven development vs. agent-assisted development | vs. | no AI |
| monoliths | vs. | microservices |
| serverless | vs. | Kubernetes |
| event-driven | vs. | request-response |
| separation of concerns | vs. | locality of behavior |
| Object Oriented (OO) | vs. | functional |
| programming language(s) | vs. | other programming language(s) |
| snake_case | vs. | camelCase |
| imperative | vs. | declarative |
| strong | vs. | weak typing |
| static | vs. | dynamic typing |
| compiled | vs. | interpreted |
| containers | vs. | Virtual Machines (VMs) |
| blue-green | vs. | rolling deployments |
| mutable | vs. | immutable infrastructure |
| DevOps | vs. | infrastructure as application components |
| SQL | vs. | NoSQL (Not only SQL) |
| REST | vs. | GraphQL |
| client-side | vs. | server-side rendering |
| Single Page Applications (SPAs) | vs. | Hypermedia applications |

| One end of the spectrum | | The other end |
| --- | :---: | --- |
| safety-critical (A bug may cause loss of life.) vs. mission-critical | vs. | business-critical vs. low-priority application |
| revenue-generating | vs. | cost-center |
| time-to-market pressure | vs. | long-term strategic |
| fixed-budget | vs. | time-and-materials |
| simple vs. complicated | vs. | complex vs. chaotic problem domain |
| proof of concept (POC) vs. green-field | vs. | mature vs. legacy |
| 24/7 global service with five-nines (99.999%) availability | vs. | occasionally used internal tool |
| scalable (from 0 to thousands or millions of concurrent users) | vs. | not scalable |
| milliseconds latency | vs. | several hours batch process |
| geographic localization required | vs. | not required |
| multi-jurisdictional compliance | vs. | no compliance requirements |
| general population users | vs. | specific group users (astronauts, doctors, psychometricians, children...) |
| short (3 weeks project) | vs. | long (3 years project) |
| small team (3 people) | vs. | large team (30 people) |
| collocated team (in the same room) | vs. | distributed team (in different time zones) |
| cultural and language issues | vs. | no such issues |
| experienced senior developers | vs. | mostly junior developers |
| highly specialized developers | vs. | "full stack" generalists |
| cohesive team that has worked together before | vs. | a new forming team vs. lone contributors |
| development | vs. | operations and maintenance |
| innovation | vs. | proven solution |
| startup | vs. | large, established enterprise |
| the same project at the beginning | vs. | at the end vs. at any point in between |
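To make one row of the first table concrete, here is a tiny sketch of imperative vs. declarative style in Python, using a made-up task (summing the squares of the even numbers in a list). Neither version is "right"; which one fits better depends on the team, the codebase, and everything in the second table.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Imperative: spell out every step. More verbose, but the control flow
# is explicit and easy to step through in a debugger.
total_imperative = 0
for n in numbers:
    if n % 2 == 0:
        total_imperative += n * n

# Declarative: state what you want. Shorter and arguably clearer,
# but the iteration and branching are hidden inside the expression.
total_declarative = sum(n * n for n in numbers if n % 2 == 0)

print(total_imperative, total_declarative)  # both: 56
```

Mandating either style across an entire organization trades away exactly this kind of per-context judgment.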