Saturday, 18 January 2020

Dependency Injection - The Pros and Cons and the Way to Utilize It the Right Way

I'm quite opinionated on the topic of dependency injection but my views were mainly formed years ago. Some time ago I encountered an article questioning how the pattern is used, and that got me thinking of refreshing and clarifying my views - which then led me to this absolutely excellent conversation on the topic in Stack Exchange: https://softwareengineering.stackexchange.com/questions/371722/criticism-and-disadvantages-of-dependency-injection

This is my take on how to get the most out of dependency injection.

NOTE before we start - by referring to DI (dependency injection) I'm referring to the pattern - not to any specific implementation. It's possible to do DI entirely manually although most often a library or framework is used where a centralized DI container manages the dependencies' lifecycles.

First of all - what is it?

  • Dependencies are injected from external context e.g. instead of creating instances of other classes they're passed in via the class constructor (or in some cases private attributes - more on that later)
  • The class focuses only on knowing its own implementation details and lets other components handle the lifecycle management of its dependencies instead of trying to do it itself
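
To make the two points above concrete, here is a minimal sketch of manual constructor injection with no container involved. All class names (SmtpMailer, Greeter, RecordingMailer) are made up for illustration.

```python
class SmtpMailer:
    """Hypothetical dependency with a side effect (would send real e-mail)."""

    def send(self, to, body):
        raise NotImplementedError("would talk to a real SMTP server")


class HardWiredGreeter:
    """Without DI: the class creates and owns its dependency."""

    def __init__(self):
        self._mailer = SmtpMailer()  # hidden from callers, hard to replace

    def greet(self, to):
        self._mailer.send(to, "Welcome!")


class Greeter:
    """With DI: the dependency comes in from the outside via the constructor."""

    def __init__(self, mailer):
        self._mailer = mailer

    def greet(self, to):
        self._mailer.send(to, "Welcome!")


class RecordingMailer:
    """Stand-in that only remembers what it was asked to send."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


# The caller decides which mailer to use; Greeter neither knows nor cares.
mailer = RecordingMailer()
Greeter(mailer).greet("ada@example.com")
print(mailer.sent)
```

Greeter never constructs a mailer, so whoever wires the application together decides which implementation it gets and how long that instance lives.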

What are the benefits and how to use it?

  • Makes it easy (or at least possible with sensible effort) to unit test the class
    • Contrast this with the situation in many legacy projects, where calls to static methods go on to execute file or DB operations etc. and break when run without the fully set up context (unless you do some pretty fancy test mocking)
  • Explicitly defines the dependencies of the class (or other similar abstraction), making it potentially more readable
  • Facilitates good design - when used correctly it promotes SRP (single responsibility principle)
  • One caveat: the point of making dependencies explicit is clarity, so there's no point in listing ILogger as a dependency of nearly every class - it's just clutter
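
A rough sketch of the testing benefit, contrasting a legacy-style function that reaches for the file system on its own with a class that gets its rate source injected. The path, the file format and all names are assumptions made up for this example.

```python
RATES_PATH = "/etc/acme-billing/tax-rates.conf"  # hypothetical path


def load_rate_from_disk(country):
    """Legacy style: a static call that needs the real file to exist."""
    with open(RATES_PATH, encoding="utf-8") as handle:
        for line in handle:
            code, rate = line.strip().split("=")
            if code == country:
                return float(rate)
    raise KeyError(country)


def legacy_total(net, country):
    # The file dependency is invisible in the signature.
    return round(net * (1 + load_rate_from_disk(country)), 2)


class InvoiceCalculator:
    """DI style: the only dependency is visible in the constructor."""

    def __init__(self, tax_rates):
        self._tax_rates = tax_rates

    def total(self, net, country):
        return round(net * (1 + self._tax_rates.rate_for(country)), 2)


class FixedTaxRates:
    """Test double: same method as the real rate source, no file access."""

    def rate_for(self, country):
        return 0.24


def test_total_adds_tax():
    calculator = InvoiceCalculator(FixedTaxRates())
    assert calculator.total(100.0, "FI") == 124.0


test_total_adds_tax()
```

Testing legacy_total would require either the real file or patching module internals, while the injected version needs nothing beyond a three-line fake.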

When is it relevant?

  • When invoking dependencies which have side effects (which for most traditional software projects written in C# / Java / etc. is most dependencies)
    • It's the side effects which need mocking in the unit tests and that also need to be much more explicitly managed
    • Meaning also that for code that is purely functional (does not have any side effects) using DI adds complexity without any payoff
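
As a small illustration of the last point: a pure function can simply be called directly, and the calling class is still trivially testable. The names here are invented for the example.

```python
def net_to_gross(net, vat_rate):
    """Pure: the result depends only on the arguments and nothing else is touched."""
    return round(net * (1 + vat_rate), 2)


class Basket:
    def __init__(self, vat_rate):
        self._vat_rate = vat_rate
        self._net_prices = []

    def add(self, net_price):
        self._net_prices.append(net_price)

    def total(self):
        # Direct call: no injected "PriceCalculator", no container registration.
        gross = (net_to_gross(price, self._vat_rate) for price in self._net_prices)
        return round(sum(gross), 2)


basket = Basket(vat_rate=0.25)
basket.add(10.0)
basket.add(20.0)
print(basket.total())  # 37.5
```

There is nothing to mock here, so wrapping net_to_gross in an injectable class would only add indirection.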

Common problems with DI containers

  • Easily makes code less readable by making it unclear where the call actually will get routed
  • Sometimes creates "interesting" hard-to-find memory leaks when an automatically managed object's lifecycle has been slightly misclassified
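
To show how small the lifecycle mistake can be, here is a sketch with a tiny hand-rolled container. TinyContainer is purely hypothetical and not modeled on any real library's API; the only difference between the two setups is one flag in the registration.

```python
class TinyContainer:
    """Hypothetical container: just enough to show singleton vs. per-resolve lifetimes."""

    def __init__(self):
        self._factories = {}
        self._singletons = {}

    def register(self, key, factory, singleton=False):
        self._factories[key] = (factory, singleton)

    def resolve(self, key):
        factory, singleton = self._factories[key]
        if not singleton:
            return factory()  # fresh instance on every resolve
        if key not in self._singletons:
            self._singletons[key] = factory()
        return self._singletons[key]


class RequestLog:
    """Meant to live for a single request: collects that request's messages."""

    def __init__(self):
        self.messages = []


def handle_request(container, request_id):
    log = container.resolve("request_log")
    log.messages.append(f"request {request_id} handled")
    return len(log.messages)  # how much this "per-request" object now holds


correct = TinyContainer()
correct.register("request_log", RequestLog)
print([handle_request(correct, i) for i in range(3)])  # [1, 1, 1]

misclassified = TinyContainer()
misclassified.register("request_log", RequestLog, singleton=True)
print([handle_request(misclassified, i) for i in range(3)])  # [1, 2, 3]
```

In the second setup the log grows with every request for as long as the process lives, and nothing in handle_request hints at why.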

Anti-patterns - How NOT to use it

  • Do NOT create extra superfluous interfaces just because DI pattern demands it
    • All good unit testing frameworks nowadays can mock concrete classes just as easily as interfaces, obviating the need for the interface
    • Superfluous interfaces just make the code less readable, make it harder to jump between points in the code and make the overall flow harder to understand
  • Using property injection - pattern used by some DI frameworks e.g. Spring. Basically mandatory when dealing with EJBs
    • This means that private variables of the class get set by the DI container after construction, meaning that the class can be in an inconsistent state
    • Unit testing essentially requires using the container in question, making the tests more cumbersome, less explicit, harder to understand etc.
    • Contrast this with constructor injection, where once the constructor call has finished successfully you can be quite sure the class stays in a consistent state (exceptions being cases where a direct dependency gets deprecated, causing the need to re-create the consumer object as well - this can be addressed via better design)
    • And the worst: makes it nigh impossible to use immutable classes
  • Breaking the classes into too small pieces without a driving design reason for it - and registering all the little classes in the container
    • Easily ends up exposing APIs and classes that should actually be internal implementation details within a larger class encompassing a coherent sub-domain
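
The inconsistent-state problem with property injection can be sketched as follows. Python has no container-managed private fields, so a plain class attribute stands in for the field a container would set later; all names are hypothetical.

```python
class Repository:
    """Hypothetical dependency."""

    def find_name(self, user_id):
        return f"user-{user_id}"


class PropertyInjectedService:
    """Property injection: something else is expected to set `repository` later."""

    repository = None  # stays None until "the container" fills it in

    def display_name(self, user_id):
        return self.repository.find_name(user_id).upper()


class ConstructorInjectedService:
    """Constructor injection: the object cannot exist without its dependency."""

    def __init__(self, repository):
        if repository is None:
            raise ValueError("repository is required")
        self._repository = repository

    def display_name(self, user_id):
        return self._repository.find_name(user_id).upper()


# The property-injected object can be created and passed around half-built.
half_built = PropertyInjectedService()
try:
    half_built.display_name(1)
except AttributeError as error:
    print("fails late, at first use:", error)

# The constructor-injected one is usable the moment it exists.
print(ConstructorInjectedService(Repository()).display_name(1))  # USER-1
```

The first variant fails only when the method is eventually called, possibly far from where the wiring went wrong, while the second refuses to be constructed in a broken state.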

So how to get the best use out of it?

  • KISS principle - keep it as simple as possible
  • Make everything as explicit as possible (while balancing it with the amount of boilerplate)
    • Stay away from aspect oriented programming - it might be somewhat decent with good enough IDE support but I have yet to see it (you'll just create headache for yourself by making the functionality of your program less explicit)
      • Mentioned because DI containers are often a way by which AOP is implemented
  • Constructor injection only - and assign to private final
  • Avoid any complex rules in DI setup - and avoid too much implicit logic ("it's magic")
  • Never create an interface unless there's an explicit need for it from other reasons (i.e. there actually are multiple implementations for it or it adds very concrete value in being more understandable that way)
    • And along with this - when you do need to use interfaces, use the most specific one (i.e. avoid overly generic ones like IDisposable unless that's explicitly required)
  • Don't DI purely functional calls (although this requires certainty on which calls are like this and which have even potential side effects)
  • Always aim for code you can understand just by looking at it - meaning that DI shouldn't cause surprises in how it injects a dependency. This is related to keeping the injection rules as simple as possible. Avoid spooky action at a distance
  • Do not ban creating instances outside of the DI framework (there will be immutable value and other classes, data classes, etc. which don't fall under DI anyway)
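
Several of the points above fit into one short sketch: constructor injection into an immutable class, explicit wiring in a single function, a value object created with a plain constructor call, and a test that mocks the concrete class without any extra interface. A frozen dataclass is used here as a rough Python stand-in for private final fields, and all domain names are made up.

```python
from dataclasses import dataclass, FrozenInstanceError
from unittest.mock import create_autospec


@dataclass(frozen=True)
class Money:
    """Immutable value: created with a normal constructor call, never injected."""

    amount: int  # minor units, e.g. cents
    currency: str


class PaymentGateway:
    """Hypothetical concrete class with a side effect; no interface extracted."""

    def charge(self, money):
        raise NotImplementedError("would call a real payment provider")


@dataclass(frozen=True)
class Checkout:
    """Constructor injection only; frozen prevents reassigning the dependency."""

    gateway: PaymentGateway

    def pay(self, amount, currency):
        return self.gateway.charge(Money(amount, currency))


def build_checkout():
    """Explicit wiring in one place, readable top to bottom."""
    return Checkout(gateway=PaymentGateway())


def test_pay_charges_the_gateway():
    # The mock is derived from the concrete class, so no interface is needed.
    gateway = create_autospec(PaymentGateway, instance=True)
    gateway.charge.return_value = "receipt-1"
    checkout = Checkout(gateway)
    assert checkout.pay(500, "EUR") == "receipt-1"
    gateway.charge.assert_called_once_with(Money(500, "EUR"))


test_pay_charges_the_gateway()
```

Whether a container or a hand-written function like build_checkout does the wiring matters less than keeping that wiring boring enough to read at a glance.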

Some random insightful notes

  • "If code has no side effects, DI is just useless complexity"
  • "Should we create superfluous interfaces to satisfy the demands of the pattern? Absolutely not"
  • Note that previously (working with monoliths) it was an issue that all of a large software component's dependencies might easily get registered in the same central DI container, causing leakage between sub-domain boundaries etc. With the rise of microservice architectures this is much less of a problem: a microservice is supposed to implement a specific sub-domain and be of a scale where it's very natural for (at least the major) dependencies to be visible to each other within the implementation, which obviates the need to consider segregation strategies for the DI container.

Saturday, 11 January 2020

Let's talk about Microservices

Microservices are a big topic nowadays and for a good reason. Being so talked about, there are also a lot of misconceptions (and truly differing opinions) about them. I started writing my take on it but then ended up finding a conference talk that summarizes the current situation so well that this ended up being pretty much another summary with some of my related commentary included. The talk in question is "Microservices: A Retrospective" by Tomer Gabel (https://www.youtube.com/watch?v=DLRfT44e8uQ).

First it's best to clarify what I refer to when using the word "microservice". Best go by Martin Fowler's definition (https://martinfowler.com/articles/microservices.html). There are often questions and misconceptions about how big a service can be and still be a microservice. The architectural style does not mandate that the services absolutely need to be a certain size - it's more about the bounded context: ideally each service has a clear role that overlaps minimally with the other services in the architecture. This may mean that one service only has a few hundred lines of code while another might have ten thousand. This is tackled from another perspective below.

Why microservices

  • Developer velocity
    • Independent releases
    • Limited scope of work
    • Stronger decoupling
    • Requires well defined bounded context
  • Scalability
    • Independent scaling
    • Independent storage
    • Limited scope is easier to optimize
    • Requires well defined bounded context
  • Resilience
    • Strong error boundaries
    • Partial failures are possible
      • Though this requires an approach where upstream failures are prepared for in the system design
      • Still a lot easier than in a monolithic application where there is high and nigh-unavoidable interdependence at multiple levels
  • Secondary benefits
    • Polyglot
    • Easier to test at least in isolation
  • Most important reason: enabler for organizational scaling
    • When a successful organization grows there will be more developers, teams, products, visibility (with a small audience partial failures may not even be noticed, but it's very different with millions of customers), liability and responsibility
    • As the organization grows, products become interdependent and so do teams
      • Incurs synchronization cost
    • Four key metrics of high-performing organizations (https://www.thoughtworks.com/radar/techniques/four-key-metrics)
      • Lead time
      • Deployment frequency
      • MTTR
      • Change fail percentage
      • They're all negatively affected by synchronization
    • Example of synchronization cost is when an issue occurs and first you need to figure out who owns the issue and should start the troubleshooting
    • So a high level objective should be to minimize synchronization and maximize independence - and microservices are a great fit for this
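
The resilience point above (partial failures are possible, provided upstream failures are prepared for in the design) can be sketched roughly like this. The page structure, the exception type and the fetch function are all assumptions made up for the example.

```python
class UpstreamUnavailable(Exception):
    """Raised by the hypothetical client when an upstream service cannot be reached."""


def product_page(product_id, fetch_recommendations):
    """Build a page; recommendations are optional, the product itself is not."""
    page = {"product_id": product_id, "recommendations": [], "degraded": False}
    try:
        page["recommendations"] = fetch_recommendations(product_id)
    except UpstreamUnavailable:
        # Partial failure: serve the page without the optional part.
        page["degraded"] = True
    return page


def healthy_upstream(product_id):
    return [product_id + 1, product_id + 2]


def broken_upstream(product_id):
    raise UpstreamUnavailable("recommendation service timed out")


print(product_page(10, healthy_upstream))
print(product_page(10, broken_upstream))
```

The error boundary sits at the service call, so a failing recommendation service degrades one part of the page instead of taking the whole request down.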

Lessons learned

  • Small is good - it's much more efficient when developers, designers, etc. can focus on a small section of the overall architecture and reason about it without the details of the rest bleeding through
  • Smaller interfaces
    • Easier to reason about
    • Easier to evolve
    • Hard to keep small though
    • Results in lower coupling which in turn supports the minimization of synchronization cost
  • What makes microservice micro
    • Not the amount of code but rather minimal API surface area
    • Also the reason why you should never share the data source between two services - SQL API is massive
  • Polyglot architecture is enabled by microservices
    • Great promise of microservices, enables multiple tech stacks
    • But incurs significant cost when the number of different stacks in a company is too high, by making it costly for people to switch teams or transfer service responsibility
    • Number of languages even in many top tech firms is rather limited:
      • Google 6
      • Facebook 9
      • Twitter 4
      • Amazon 3
      • Netflix ~3
      • Spotify 4
      • If you have more than one or two tech stacks for five thousand engineers you're doing it wrong
    • There's a tension between the individualistic view of always selecting the best tool for the job at hand and keeping the overall architecture consistent
    • Even organizations that initially enable total freedom of individually choosing the best tools based on developer preference end up consolidating to a smaller number of different stacks later
    • A smaller number of stacks makes moving between teams and hiring easier
    • For any regular organization 2-3 tech stacks is what you should shoot for
  • Conway's Law is true - "Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations"
    • Means that you want to structure the teams in a way that supports the architecture you want and vice versa
    • Common danger signs:
      • One data store owned by multiple services
      • Single system or domain or responsibility owned by multiple teams
    • Single responsibility principle SRP - if you're constantly violating it
      • Reorganize your services or your teams
  • Operations matter
    • Developer velocity = ship fast and strong
    • No bottlenecks allowed:
      • No global release
        • Instead many small rapid iterations
      • No centralized ops
        • Distributed ownership, devops
      • No manual deployment
        • Fully automated deployment
      • No static topology
        • Ephemeral, autoscaled, serverless
  • Automation is key - everything as code
    • Infrastructure as code
    • Automated deployment and CI/CD pipelines as code
    • Automated monitoring, metrics, alerts
    • Remediation i.e. automated rollbacks, A/B testing, canary releases, etc.
    • Provisioning
    • Automation empowers your teams
    • Tools are still developing but at least we know what to shoot for
  • All of this supports minimization of synchronization cost

Criticism

  • One often raised (and to a degree valid) criticism for microservices is that by making the services smaller we end up just moving some of the complexity from the code level to the architectural level
    • This is indeed a problem when the higher level does not provide tooling to address the complexity better
    • This is why I wouldn't really recommend full microservice architecture prior to going full-on Kubernetes since I see K8S being a huge jump forward in the tooling to manage this complexity (better than is often practical at code level for monoliths)

What we haven't learned

  • Insufficient knowledge of the "physics of distributed systems" e.g. the CAP theorem
    • We're all building distributed systems and concurrency is key
    • Disregarding CAP means
      • You will end up with inconsistent data
      • You will not scale
      • You will lose data
  • Observability
    • Distributed systems are by nature
      • Disaggregated
      • Hard to reason about
      • We are still deficient in tooling and methodology
    • Tooling is getting better
      • Tracing (Jaeger, Zipkin)
      • Metrics (Grafana, Prometheus)
      • Log aggregation (ELK et al)
    • But that's not enough when you can't debug in isolation
    • Have to know
      • What to log
      • What to count
      • What to monitor
      • How to make sense of it all
      • There are no easy answers
      • E.g. we're still counting average response time
        • Why average response time is a bad metric was summarized nicely in a comment to the video: "Response time usually has a long tail distribution, and the mean/average value of that does not really tell too much. I can be high because there were some very long requests or it can be high because all requests started to take longer. Instead you can use percentiles which tells you what portion of requests are faster than their value. For instance, P50 (median) tells that every second request is faster than its value and every other second request is slower than its value. For samples with normal distribution median is the same as mean/average. Of course you can split the distribution at any arbitrary point (e.g. P75, P90, P99). The downside of percentiles that you require all the samples at one place to compute them, although there are algorithms (e.g. TDigest) that can give you good estimates in distributed environments."
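
The point about averages in the quoted comment is easy to check with made-up numbers. Below is a small sketch using a simple nearest-rank percentile; the two sample sets are invented so that they share the same mean while describing very different situations.

```python
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = max(1, -(-p * len(ordered) // 100))  # integer ceil(p * n / 100)
    return ordered[rank - 1]


# Scenario A: most requests are fast, a few are extremely slow (long tail).
long_tail_ms = [100] * 95 + [5000] * 5

# Scenario B: every request got uniformly slower.
all_slower_ms = [345] * 100

for name, samples in [("long tail", long_tail_ms), ("all slower", all_slower_ms)]:
    print(
        name,
        "mean:", mean(samples),
        "P50:", percentile(samples, 50),
        "P95:", percentile(samples, 95),
        "P99:", percentile(samples, 99),
    )
```

Both scenarios report a mean of 345 ms, yet in the first one the median is 100 ms with a P99 of 5000 ms, while in the second every percentile is 345 ms. Only the percentiles tell the two apart.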

Recap

  • We aim to
    • Minimize synchronization
    • Maximize independence
  • We're struggling with
    • Safety
    • Scalability
    • Observability
  • The goals are met and issues alleviated via event-driven architectures
    • Events make everything simpler (easier to build, easier to reason about, easier to test)
    • Modeling interactions with events
      • Enforces strong context boundaries
      • Lets you scale services independently
      • Lets you observe the system in motion
      • Increases system reliability
    • Persisting event streams (event sourcing)
      • Observability and auditing built in
      • Lets you scale use cases independently (CQRS)
      • Precludes full consistency (good thing)
        • Outside of the narrow field of software engineering, virtually nothing in life is fully consistent, no business is fully consistent
        • To get availability we have to very often give up some consistency
      • E.g. a bank account's balance is not stored as such but derived from the series of all transactions in the account's history
    • Modeling workflows in terms of events enabled by Saga pattern (for cross-domain consistency) - a viable alternative to transactions in most cases
  • Concentrate on principles - not implementation details. If you're building a microservice framework then you're probably wasting your time
    • These tools have already been commoditized, little reason to roll your own
    • Kubernetes, Kafka
  • Invest in studying
    • CAP tradeoffs
    • Domain modelling
    • Event storming
    • Event sourcing / CQRS
    • Sagas
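
The bank account example mentioned under event sourcing above can be sketched in a few lines. The event types and function names are invented for illustration, and a real system would persist the stream rather than keep it in a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int  # minor units, e.g. cents


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def balance(events):
    """Current state is derived by replaying the event stream from the start."""
    total = 0
    for event in events:
        if isinstance(event, Deposited):
            total += event.amount
        elif isinstance(event, Withdrawn):
            total -= event.amount
    return total


def statement(events):
    """A second read model built from the same stream (the CQRS idea)."""
    return [
        f"+{event.amount}" if isinstance(event, Deposited) else f"-{event.amount}"
        for event in events
    ]


history = [Deposited(100), Withdrawn(30), Deposited(5)]
print(balance(history))      # 75
print(balance(history[:2]))  # 70, the balance as it was after two events
print(statement(history))    # ['+100', '-30', '+5']
```

Because the events are the source of truth, auditing and "what was the state back then" questions come for free, and new read models can be added later without touching the write side.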

What changed? Why abandon the traditional ways?

  • 20-25+ years ago hardware was expensive, virtualization was just taking its first baby-steps and software ran on bare metal which had to be manually provisioned and managed
  • Servers were costly centralized resources
    • It made total sense to optimize for shared resource usage and build monoliths
  • This is the foundation that changed - shared resource optimization is no longer the driving factor, allowing us to instead manage and execute different components of the overall architecture in almost complete isolation from each other

Changing considerations at different scales

Have you noticed that it's very easy to get started with certain languages and tools but when the project starts growing past a certain limit it all turns into an unmaintainable mess? After which you see new issues which are addressed by a different set of tools that are harder to get into initially but avoid some or many of the scaling pitfalls.

For example it was really easy to start programming in BASIC but the lack of tooling and features for structuring the code well put a very real practical limit on its ability to scale. With C on the other hand it was harder to get started but see all the things built with it. Nowadays one such tradeoff might be between running single containers in the cloud with e.g. AWS Fargate or self-installed Docker versus using a Kubernetes cluster.

This basic pattern repeats itself everywhere in programming and tech - and not just in two layers. There are undoubtedly several layers (infinite even perhaps?) of this everywhere.

How does this affect you on a practical level? First of all I would start off by being very wary of "easy to get started, immediate results, no coding required" tools by default. Of course this is only a rule of thumb and won't apply everywhere (there certainly are some tools and techniques which are both easy to get started with and do scale well).

What makes this such a hard problem to tackle is that one very often needs to learn the lessons with the simpler tools first to understand why some of the scaling features are required and potentially worth the time and monetary investment. Of course you also have to consider that not every project will need to scale up to Google scale so immediately starting to design an architecture for that will cause cost overruns and lead to very unnecessary non-lean over-engineering.

At this point I can't really provide further advice beyond keeping this consideration in mind when making tech decisions - and staying away from advertised silver bullets.

This effect is also related to another phenomenon in IT - namely reinventing the wheel with new technologies. As a previously dominant language or framework matures and gathers many more features, it often also gets more complicated to use. This complexity combined with a few perceived faults in the language core gives rise to a few new languages / frameworks that start with the premise of fixing the glaring problems and making it really easy to get started. Interest starts building up and soon it's noticed that some of the essential features are missing (features that in the previous language were part of the base or had been added during maturing but are not technologically sexy highlights - and perhaps often relate to larger scale management). The fix to this is often re-implementing the same basic features that often originate from the 60s and 70s - and they are often implemented by fresh juniors with lots of skill and enthusiasm but little experience and knowledge (thinking they're inventing something brand new).

Eventually (assuming the language / framework survives) the missing features get implemented in a sensible fashion, the latest new thing starts nearing maturity - and the same cycle then starts again. Nowadays this cycle is nowhere as visible as in the JavaScript ecosystem.
