Journey of a Contemplative Architect: January 2020

Saturday, 18 January 2020

Dependency Injection - The Pros and Cons and the Way to Utilize It the Right Way

I'm quite opinionated on the topic of dependency injection but my views were mainly formed years ago. Some time ago I encountered an article questioning the usage pattern and that got me thinking about refreshing and clarifying my views - which then led me to this absolutely excellent conversation on the topic on Stack Exchange: https://softwareengineering.stackexchange.com/questions/371722/criticism-and-disadvantages-of-dependency-injection

This is my take on how to get the most out of dependency injection.

NOTE before we start - by referring to DI (dependency injection) I'm referring to the pattern - not to any specific implementation. It's possible to do DI entirely manually although most often a library or framework is used where a centralized DI container manages the dependencies' lifecycles.

First of all - what is it?

Dependencies are injected from an external context, e.g. instead of creating instances of other classes the class receives them via its constructor (or in some cases via private attributes - more on that later)
The class focuses only on knowing its own implementation details and lets other components handle the lifecycle management of the dependencies instead of trying to do it itself
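A minimal sketch of the pattern done entirely by hand, with no container involved. It is written in Python for brevity, and the names (`GreetingService`, `SystemClock`, `FixedClock`) are made up for illustration:

```python
from datetime import datetime, timezone


class SystemClock:
    """Real dependency: reads the wall clock (a side effect)."""

    def now(self) -> datetime:
        return datetime.now(timezone.utc)


class FixedClock:
    """Stand-in for tests: always returns the same instant."""

    def __init__(self, instant: datetime) -> None:
        self._instant = instant

    def now(self) -> datetime:
        return self._instant


class GreetingService:
    """Knows how to build a greeting, not how to obtain the time."""

    def __init__(self, clock) -> None:
        # The dependency is passed in; the class never creates it.
        self._clock = clock

    def greet(self, name: str) -> str:
        hour = self._clock.now().hour
        part = "morning" if hour < 12 else "afternoon" if hour < 18 else "evening"
        return f"Good {part}, {name}"


# Production wiring happens once, at the edge of the application.
service = GreetingService(SystemClock())

# A test wires in a deterministic dependency instead.
nine_am = datetime(2020, 1, 18, 9, 0, tzinfo=timezone.utc)
test_service = GreetingService(FixedClock(nine_am))
```

The only thing `GreetingService` knows about its clock is that it has a `now()` method; who creates the clock and how long it lives is decided elsewhere.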

What are the benefits and how to use it?

Makes it easy (or at least possible with sensible effort) to unit test the class

Contrast this with the situation in many legacy projects, where code calls static methods that in turn execute file or DB operations etc. and break when run without the fully set up context (unless you do some pretty fancy test mocking)

Explicitly define the dependencies of the class (or other similar abstraction) making it potentially more readable
Facilitates good design - when used correctly promotes SRP (single responsibility principle)
One of the advantages of DI is making the dependencies explicit and clear, so there's no point in listing ILogger as a dependency of nearly every class - it's just clutter

When is it relevant?

When invoking dependencies which have side effects (which for most traditional software projects written in C# / Java / etc. is most dependencies)

It's the side effects which need mocking in the unit tests and that also need to be much more explicitly managed
Meaning also that for code that is purely functional (does not have any side effects) using DI adds complexity without any payoff
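To illustrate the split, here is a small sketch under made-up names (`net_price`, `InvoicePrinter`): the pure calculation is called directly, and only the part that has a side effect (writing output) is injected.

```python
from typing import Callable


def net_price(gross: float, vat_rate: float) -> float:
    """Pure: same inputs give the same output, so there is nothing to mock."""
    return round(gross / (1 + vat_rate), 2)


class InvoicePrinter:
    """Only the side-effecting collaborator (the writer) is injected."""

    def __init__(self, write_line: Callable[[str], None]) -> None:
        self._write_line = write_line

    def print_net(self, gross: float, vat_rate: float) -> None:
        # The pure function is used directly, without any injection.
        self._write_line(f"net: {net_price(gross, vat_rate)}")


# In production write_line could be print; a test just captures the lines.
captured = []
InvoicePrinter(captured.append).print_net(124.0, 0.24)
```

Injecting `net_price` as well would add a constructor parameter and a wiring rule without making anything easier to test.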

Common problems with DI containers

Easily makes code less readable by making it unclear where the call actually will get routed
Sometimes creates "interesting" hard-to-find memory leaks when an automatically managed object's lifecycle has been slightly misclassified

Anti-patterns - How NOT to use it

Do NOT create extra superfluous interfaces just because DI pattern demands it

All good unit testing frameworks nowadays enable classes to be treated in a similar manner for mocking obviating the need for the interface
Superfluous interfaces just make the code less readable, harder to jump between points in the code and harder to understand the overall flow
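As one example of mocking a concrete class directly, Python's standard library offers `unittest.mock.create_autospec`. The classes `MailSender` and `WelcomeFlow` below are hypothetical; there is deliberately no separate interface for the mail sender.

```python
from unittest.mock import create_autospec


class MailSender:
    """Concrete class with a side effect; no interface is extracted for it."""

    def send(self, to: str, body: str) -> None:
        raise RuntimeError("would talk to a real SMTP server")


class WelcomeFlow:
    def __init__(self, mail_sender: MailSender) -> None:
        self._mail_sender = mail_sender

    def welcome(self, address: str) -> None:
        self._mail_sender.send(address, "Welcome aboard")


# The mock is derived from the concrete class and keeps its method signatures.
fake_sender = create_autospec(MailSender, instance=True)
WelcomeFlow(fake_sender).welcome("ada@example.com")
fake_sender.send.assert_called_once_with("ada@example.com", "Welcome aboard")
```

Because the mock is built from the class itself, calling `send` with the wrong number of arguments fails in the test just as it would against the real object.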

Using property injection - pattern used by some DI frameworks e.g. Spring. Basically mandatory when dealing with EJBs

This means that private variables of the class get set by the DI meaning that the class can be in an inconsistent state
Unit testing essentially requires using the container in question, making the tests more cumbersome, less explicit, harder to understand etc.
Contrast this with constructor injection, where after the constructor call finishes successfully you can be quite sure the class stays in a consistent state (exceptions being cases where a direct dependency gets deprecated, causing a need to re-create the consumer object as well - this can be addressed via better design)
And the worst: makes it nigh impossible to use immutable classes
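The difference can be sketched as follows. The class names are invented for illustration, and `@dataclass(frozen=True)` is used here as the closest Python analogue to assigning a constructor argument to a `private final` field.

```python
from dataclasses import dataclass, FrozenInstanceError


class Repository:
    def load(self, key: str) -> str:
        return f"value-for-{key}"


class PropertyInjected:
    """Property injection: the object exists before its dependency does."""

    repository = None  # expected to be set later, e.g. by a container

    def read(self, key: str) -> str:
        return self.repository.load(key)


@dataclass(frozen=True)
class ConstructorInjected:
    """Constructor injection: cannot be built without the dependency."""

    repository: Repository

    def read(self, key: str) -> str:
        return self.repository.load(key)


half_built = PropertyInjected()
try:
    half_built.read("a")
    property_result = "ok"
except AttributeError:
    # Nothing was injected yet, so the object is in an inconsistent state.
    property_result = "inconsistent"

complete = ConstructorInjected(Repository())
constructor_result = complete.read("a")

try:
    complete.repository = Repository()
    reassigned = True
except FrozenInstanceError:
    # The dependency cannot be swapped after construction.
    reassigned = False
```

With property injection the half-built state is reachable by any code that gets hold of the object early; with constructor injection that state cannot be represented at all.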

Breaking the classes into too small pieces without a driving design reason for it - and registering all the little classes in the container

Easily ends up exposing APIs and classes that should actually be internal implementation details within a larger class encompassing a coherent sub-domain

So how to get the best use out of it?

KISS principle - keep it as simple as possible
Make everything as explicit as possible (while balancing it with the amount of boilerplate)

Stay away from aspect oriented programming - it might be somewhat decent with good enough IDE support but I have yet to see it (you'll just create headache for yourself by making the functionality of your program less explicit)

Mentioned because DI containers are often a way by which AOP is implemented

Constructor injection only - and assign to private final

Works very well with immutable / functional style of programming: https://contemplative-architect-journey.blogspot.com/2019/12/declarative-programming-simplicity.html

Avoid any complex rules in DI setup - and avoid too much implicit logic ("it's magic")
Never create an interface unless there's an explicit need for it from other reasons (i.e. there actually are multiple implementations for it or it adds very concrete value in being more understandable that way)

And along with this - when you do need to use interfaces, use the most specific one (i.e. avoid overly generic ones like IDisposable unless that's explicitly required)

Don't DI purely functional calls (although this requires certainty on which calls are like this and which have even potential side effects)
Always aim for code you can understand just by looking at it - meaning that DI shouldn't cause surprises in how it injects a dependency. This is related to keeping the injection rules as simple as possible. Avoid spooky action at a distance
Do not ban creating instances outside of the DI framework (there will be immutable value and other classes, data classes, etc. which don't fall under DI anyway)

Some random insightful notes

"If code has no side effects, DI is just useless complexity"
"Should we create superfluous interfaces to satisfy the demands of the pattern? Absolutely not"
Note that previously (working with monoliths) it was an issue that all the dependencies of a large software component might easily get registered in the same central DI container, causing leakage between sub-domain boundaries etc. With the rise of microservice architectures this is much less of a problem, since a microservice itself is supposed to implement a specific sub-domain and be of a scale where it's very natural for (at least major) dependencies to be visible to each other within the implementation, obviating the need to consider segregation strategies for the DI container.

Related: https://contemplative-architect-journey.blogspot.com/2020/01/lets-talk-about-microservices.html

Saturday, 11 January 2020

Let's talk about Microservices

Microservices are a big topic nowadays and for a good reason. Being so talked about, they are also the subject of a lot of misconceptions (and truly differing opinions). I started writing my take on it but then ended up finding a conference talk that summarizes the current situation so well that this ended up being pretty much another summary with some of my related commentary included. The referenced conference talk is "Microservices: A Retrospective" by Tomer Gabel (https://www.youtube.com/watch?v=DLRfT44e8uQ).

First it's best to clarify what I refer to when using the word "microservice". Best go by Martin Fowler's definition (https://martinfowler.com/articles/microservices.html). There are often questions and misconceptions about the size of a service when it's a microservice. The architectural style does not mandate that the services absolutely need to be a certain size - it's more about the bounded context - the ideal structure so that each service has a clear and minimally overlapping role with other services in the architecture. This may mean that one service only has a few hundred lines of code while another might have ten thousand. This is tackled from another perspective below.

Why microservices

Developer velocity

Independent releases
Limited scope of work
Stronger decoupling
Requires well defined bounded context

Scalability

Independent scaling
Independent storage
Limited scope is easier to optimize
Requires well defined bounded context

Resilience

Strong error boundaries
Partial failures are possible

This does require an approach where upstream failures are prepared for in the system design
Still a lot easier than in a monolithic application where there is high and nigh-unavoidable interdependence at multiple levels

Secondary benefits

Polyglot
Easier to test at least in isolation

Most important reason: enabler for organizational scaling

When a successful organization grows there will be more developers, teams, products, visibility (with a small audience partial failures may not even be noticed, but it's very different with millions of customers), liability and responsibility.
While growing, products become interdependent and so do teams

Incurs synchronization cost

Four key metrics of high-performing organizations (https://www.thoughtworks.com/radar/techniques/four-key-metrics)

Lead time
Deployment frequency
MTTR
Change fail percentage
They're all negatively affected by synchronization
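For concreteness, the four metrics can be computed from a list of deployment records roughly like this. The `Deployment` record shape and the sample data are invented for illustration, and the definitions used (lead time as commit-to-production, MTTR as time from a failed deployment to restoration) are one common reading of the metrics rather than something taken from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

HOUR = timedelta(hours=1)


@dataclass(frozen=True)
class Deployment:
    committed: datetime
    deployed: datetime
    restored: Optional[datetime] = None  # set only if the change failed


def four_key_metrics(deployments, period_days):
    failures = [d for d in deployments if d.restored is not None]
    lead_times = [d.deployed - d.committed for d in deployments]
    restore_times = [d.restored - d.deployed for d in failures]
    mttr = sum(restore_times, timedelta()) / len(failures) if failures else timedelta()
    return {
        "lead_time_hours": sum(lead_times, timedelta()) / len(lead_times) / HOUR,
        "deploys_per_day": len(deployments) / period_days,
        "mttr_hours": mttr / HOUR,
        "change_fail_percentage": 100 * len(failures) / len(deployments),
    }


start = datetime(2020, 1, 6, 8, 0)
sample = [
    Deployment(start, start + 2 * HOUR),
    Deployment(start + 5 * HOUR, start + 9 * HOUR),
    Deployment(start + 24 * HOUR, start + 30 * HOUR, restored=start + 31 * HOUR),
    Deployment(start + 32 * HOUR, start + 36 * HOUR),
]
metrics = four_key_metrics(sample, period_days=2)
```

Any time spent waiting on another team shows up directly in `lead_time_hours` and `mttr_hours`, which is the sense in which synchronization hurts these numbers.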

Example of synchronization cost is when an issue occurs and first you need to figure out who owns the issue and should start the troubleshooting
So a high level objective should be to minimize synchronization, maximize independence and microservices are a great fit for this

Lessons learned

Small is good - it's much more efficient when developers, designers, etc. can focus on a small section of the overall architecture and reason about it without the details of the rest bleeding through
Smaller interfaces

Easier to reason about
Easier to evolve
Hard to keep small though
Results in lower coupling which in turn supports the minimization of synchronization cost

What makes microservice micro

Not the amount of code but rather minimal API surface area
Also the reason why you should never share the data source between two services - SQL API is massive

Polyglot architecture is enabled by microservices

Great promise of microservices, enables multiple tech stacks
But incurs significant cost when the number of different stacks in a company is too high, by making it costly for people to switch teams, transfer service responsibility, etc.
Number of languages even in many top tech firms is rather limited:

Google 6
Facebook 9
Twitter 4
Amazon 3
Netflix ~3
Spotify 4
If you have more than one or two tech stacks for five thousand engineers you're doing it wrong

There's a dichotomy between individualistic point of view for always selecting the best tool for the job in that situation and for keeping the overall architecture consistent
Even organizations that initially enable total freedom of individually choosing the best tools based on developer preference end up consolidating to a smaller number of different stacks later
Smaller number of stacks makes mobilization between teams easier, easier hiring
For any regular organization 2-3 tech stacks is what you should shoot for

Conway's Law is true - "Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations"

Means that you want to structure the teams in a way that supports the architecture you want and vice versa
Common danger signs:

One data store owned by multiple services
Single system or domain or responsibility owned by multiple teams

Single responsibility principle SRP - if you're constantly violating it

Reorganize your services or your teams

Operations matter

Developer velocity = ship fast and strong
No bottlenecks allowed:

No global release

Instead many small rapid iterations

No centralized ops

Distributed ownership, devops

No manual deployment

Fully automated deployment

No static topology

Ephemeral, autoscaled, serverless

Automation is key - everything as code

Infrastructure as code
Automated deployment and CI/CD pipelines as code
Automated monitoring, metrics, alerts
Remediation i.e. automated rollbacks, A/B testing, canary releases, etc.
Provisioning
Automation empowers your teams
Tools are still developing but at least we know what to shoot for

All of this supports minimization of synchronization cost

Criticism

One often raised (and to a degree valid) criticism for microservices is that by making the services smaller we end up just moving some of the complexity from the code level to the architectural level

This is indeed a problem when the higher level does not provide tooling to address the complexity better
This is why I wouldn't really recommend full microservice architecture prior to going full-on Kubernetes since I see K8S being a huge jump forward in the tooling to manage this complexity (better than is often practical at code level for monoliths)

What we haven't learned

Insufficient knowledge of the "physics of distributed systems" e.g. CAP theorem

We're all building distributed systems and concurrency is key
Disregarding CAP means

You will end up with inconsistent data
You will not scale
You will lose data

Observability

Distributed systems are by nature

Disaggregate
Hard to reason about
We are still deficient in tooling and methodology

Tooling is getting better

Tracing (Jaeger, Zipkin)
Metrics (Grafana, Prometheus)
Log aggregation (ELK et al)

But that's not enough when you can't debug in isolation
Have to know

What to log
What to count
What to monitor
How to make sense of it all
There are no easy answers
E.g. we're still counting average response time

Why average response time is a bad metric was summarized nicely in a comment to the video: "Response time usually has a long tail distribution, and the mean/average value of that does not really tell too much. It can be high because there were some very long requests or it can be high because all requests started to take longer. Instead you can use percentiles which tells you what portion of requests are faster than their value. For instance, P50 (median) tells that every second request is faster than its value and every other second request is slower than its value. For samples with normal distribution median is the same as mean/average. Of course you can split the distribution at any arbitrary point (e.g. P75, P90, P99). The downside of percentiles that you require all the samples at one place to compute them, although there are algorithms (e.g. TDigest) that can give you good estimates in distributed environments."
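A small worked example of the point made in the comment. The response times are made-up numbers, and the `percentile` helper is a simple nearest-rank implementation written for this sketch (it needs all samples in one place, which is exactly the downside the comment mentions).

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# 95 fast requests and 5 very slow ones, in milliseconds.
response_times_ms = [100] * 95 + [5000] * 5

average = mean(response_times_ms)           # 345
p50 = percentile(response_times_ms, 50)     # 100
p95 = percentile(response_times_ms, 95)     # 100
p99 = percentile(response_times_ms, 99)     # 5000
```

The average of 345 ms describes none of the actual requests: 95% of them took 100 ms, and the slow tail only becomes visible at P99.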

Recap

We aim to

Minimize synchronization
Maximize independence

We're struggling with

Safety
Scalability
Observability

The goals are met and issues alleviated via event-driven architectures

Events make everything simpler (easier to build, easier to reason about, easier to test)
Modeling interactions with events

Enforces strong context boundaries
Lets you scale services independently
Lets you observe the system in motion
Increases system reliability

Persisting event streams (event sourcing)

Observability and auditing built in
Lets you scale use cases independently (CQRS)
Precludes full consistency (good thing)

Outside of the narrow field of software engineering, virtually nothing in life is fully consistent, no business is fully consistent
To get availability we have to very often give up some consistency

E.g. a bank account's balance can be modeled as the series of all transactions in its history
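The bank account example can be sketched as a fold over the event history. The event types `Deposited` and `Withdrawn` and the sample amounts are hypothetical; a real event store would also carry timestamps, identifiers and so on.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Deposited:
    amount: int  # minor units, e.g. cents


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply_event(balance: int, event) -> int:
    """Compute the next balance from the previous balance and one event."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def balance_of(events) -> int:
    # The balance is never stored; it is derived from the full history.
    return reduce(apply_event, events, 0)


history = [Deposited(10_000), Withdrawn(2_500), Deposited(500)]
current_balance = balance_of(history)
balance_after_two_events = balance_of(history[:2])
```

Because the history is the source of truth, the balance at any earlier point in time (and an audit trail of how it got there) comes for free by replaying a prefix of the events.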

Modeling workflows in terms of events enabled by Saga pattern (for cross-domain consistency) - a viable alternative to transactions in most cases
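A heavily simplified, in-process sketch of the idea behind a saga: each step comes with a compensating action, and when a step fails the already completed steps are undone in reverse order. The `run_saga` helper and the order workflow are made up for illustration; in a real system each step would typically be a message to another service, and compensations themselves would need retry handling.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; on failure undo completed steps in reverse."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []


def reserve_stock():
    log.append("stock reserved")


def release_stock():
    log.append("stock released")


def charge_card():
    raise RuntimeError("card declined")


def refund_card():
    log.append("card refunded")


def ship_order():
    log.append("order shipped")


def cancel_shipment():
    log.append("shipment cancelled")


succeeded = run_saga([
    (reserve_stock, release_stock),
    (charge_card, refund_card),
    (ship_order, cancel_shipment),
])
```

Here the payment step fails, so the stock reservation is released and the shipment is never attempted. No distributed transaction is held open across the services; consistency is restored by the compensations instead.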

Concentrate on principles - not implementation details. If you're building a microservice framework then you're probably wasting your time

These tools have already been commoditized, little reason to roll your own
Kubernetes, Kafka

Invest in studying

CAP tradeoffs
Domain modelling
Event storming
Event sourcing / CQRS
Sagas

What changed? Why abandon the traditional ways?

20-25+ years ago hardware was expensive, virtualization was just taking its first baby-steps and software ran on bare metal which had to be manually provisioned and managed
Servers were costly centralized resources

It made total sense to optimize for shared resource usage and build monoliths

This is the fundamental thing that changed - shared resource optimization is no longer the driving factor, allowing us to instead manage and execute different components of the overall architecture in almost complete isolation from each other

Changing considerations at different scales

Have you noticed that it's very easy to get started with certain languages and tools but when the project starts growing past a certain limit it all turns into an unmaintainable mess? After which you see new issues which are addressed by a different set of tools that are harder to get into initially but avoid some or many of the scaling pitfalls.

For example it was really easy to start programming in BASIC but the lack of tooling and features for structuring the code well put a very real practical limit on its ability to scale. With C on the other hand it was harder to get started but see all the things built with it. Nowadays one of such tradeoffs might be between running single containers in the cloud with e.g. AWS Fargate or self-installed Docker versus using a Kubernetes cluster.

This basic pattern repeats itself everywhere in programming and tech - and not just in two layers. There are undoubtedly several layers (infinite even perhaps?) of this everywhere.

How does this affect you on a practical level? First of all I would start off by being very wary of "easy to get started, immediate results, no coding required" tools by default. Of course this is only a rule of thumb and won't apply everywhere (there certainly are some tools and techniques which are both easy to get started with and do scale well).

What makes this such a hard problem to tackle is that one very often needs to learn the lessons with the simpler tools first to understand why some of the scaling features are required and potentially worth the time and monetary investment. Of course you also have to consider that not every project will need to scale up to Google scale so immediately starting to design an architecture for that will cause cost overruns and lead to very unnecessary non-lean over-engineering.

At this point I can't really provide further advice beyond keeping this consideration in mind when making tech decisions - and staying away from advertised silver bullets.

This effect is also related to another phenomenon in IT - namely reinventing the wheel with new technologies. As a previously dominant language or framework matures and gathers many more features, it often also gets more complicated to use. This complexity, combined with a few perceived faults in the language core, gives rise to a few new languages / frameworks that start with the premise of fixing the glaring problems and making it really easy to get started. Interest starts building up and soon it's noticed that some of the essential features are missing (features that in the previous language were part of the base or had been added while it matured, but that are not technologically sexy highlights - and often relate to larger scale management). The fix to this is often re-implementing the same basic features that often originate from the 60s and 70s - frequently implemented by fresh juniors with lots of skill and enthusiasm but little experience and knowledge (thinking they're inventing something brand new).

Eventually (assuming the language / framework survives) the missing features get implemented in a sensible fashion, the latest new thing starts nearing maturity - and the same cycle then starts again. Nowadays this cycle is nowhere as visible as in the JavaScript ecosystem.
