Thursday, April 26, 2012

SOA Discussion

SOA is a constant topic of conversation these days and has become a mindset for people trying to architect systems that are available, dependable, scalable and maintainable.  The difficulty with discussing SOA at a high level, or simply stating that it is your goal, is that SOA itself is not easily definable when placed in the paradigm of any individual business or system.  SOA is a conceptual methodology, or a philosophy of sorts.  That makes it entirely possible for a project team to actively pursue a common goal while each member holds a different interpretation of it.  As this becomes more of a daily task in reality and less of a mental exercise, it seemed like a good time to lay out what this architecture means to me.  In my mind, SOA can be defined as an architectural means to build a system consisting of many autonomous and focused services.

To architect in SOA, you have to start by defining the level at which you want the system in question to be integrated.  Since any business system can be comprised of many smaller systems, on many platforms with many different implementations, it is paramount to focus on an integration interface.  As I have heard said, and love to repeat, "You have to eat your own dog food."  Basically, rather than creating a system and then opening up integration, the systems themselves have to be a packaged consumption of service integration.  Rather than get into a semantic argument about the exact definition of a service, for the sake of this discussion, let’s assume that a service is a program that is interacted with via message exchanges.

Services must be designed to be highly available and scalable.  When I speak of being scalable, it is to imply the need for a consistent and constant means of interacting with the service itself.  The configurations and aggregations of the service are where you should scale out and customize to meet an individual need.  But at the end of the day, the core service should be small, focused and consistent.  It should perform the advertised action entirely and exclusively.  This is the Single Responsibility aspect of the SOLID principles, which should be followed as a rule.  Being singularly focused allows our services to come together to form a loosely-coupled infrastructure, which positions us for the aforementioned scalability and maintainability.

Systems architected with this philosophy result in a network of logical processes and functions that are no longer held prisoner by existing infrastructure or large project migrations.  This allows individual services to be updated, replaced, rewritten or otherwise reinvented without a direct impact on any integrating processes.  As long as all service interfaces remain consistent and constantly rely upon a messaging system for interaction, the actual logic and heavy lifting of these services is completely abstracted behind the interface messages.  This, to me, is the heart of the matter.  Every service, every process and every system, at this point, should be interacting primarily with an eternal set of interface messages, with a seeming disregard for the entity with which it is communicating.
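
To make the message-exchange definition concrete, here is a minimal sketch of what such a service surface might look like.  The names (IEmployeeService, UpdateEmployeeRequest and so on) are purely illustrative, not a prescription; the point is that the consumer only ever sees a request message going in and a response message coming out, never the implementation behind them.

using System.ServiceModel;
using System.Runtime.Serialization;

// Illustrative only: a single-purpose service whose entire public surface is
// one request message in and one response message out.
[ServiceContract]
public interface IEmployeeService
{
    [OperationContract]
    UpdateEmployeeResponse UpdateEmployee(UpdateEmployeeRequest request);
}

[DataContract]
public class UpdateEmployeeRequest
{
    [DataMember] public int EmployeeId { get; set; }
    [DataMember] public string DisplayName { get; set; }
}

[DataContract]
public class UpdateEmployeeResponse
{
    [DataMember] public bool Succeeded { get; set; }
    [DataMember] public string FailureReason { get; set; }
}

Everything the consumer needs to know lives in those two messages; how UpdateEmployee actually does its work is the service's business alone.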

The Four Principles of SOA
A generally accepted starting point for SOA is adherence to its four principles.  We will go through them one by one and try to get to the root of what each means, both in spirit and in practice.

Boundaries Are Explicit
A boundary is really just the dividing wall between a public interface and an internal implementation.  That concept is fairly straightforward and should be easily grasped.  It is vital that it is followed to allow for dynamic relocation of services and the like.  A good way to grasp this principle is to make sure that a consumer of a service has absolutely no insight into the implementation of the internal process.  The consuming entity should have no control over the performance of the services being consumed.

I would like to take this a little bit further and talk about bidirectional crossing of tier boundaries, but in order to do so, we need to define the terms ‘layer’ and ‘tier’.  A ‘layer’ is a logical grouping of code and/or processes.  For instance, a DAL, or data access layer, is a grouping of functions that interacts with a dataset of one kind or another.  The DAL should provide services/functions that encapsulate all interaction with the dataset to ensure all rules are consistently met and any policy enforcement can happen in a dependable manner.  This has absolutely nothing to do with physical location, and a layer can arguably share process space with other layers depending upon the facts of a given situation.  A ‘tier’, on the other hand, implicitly indicates a physical location.  In other words, a tier contains one or more layers.

A good rule of thumb is that no process should interact with another process any further out (that is, toward the consumer) than the tier at which it is running.  This rule is not broken by push notifications, update messages or callbacks; those are different types of communication and are completely acceptable.  This rule ensures that your objects are designed with the appropriate level of encapsulation in mind.

For example, consider the following three tiers and their logical layout: GUI, BLL and DAL.  The GUI will be the consumer of a service on the BLL, UpdateEmployee.  BLL::UpdateEmployee will in turn be a consumer of DAL::UpdateEmployee.  The argument comes that the Employee entity could have a related EmployeePicture record.  The quick and dirty reaction is to have DAL::UpdateEmployee update the EmployeePicture record directly against the dataset.  A later revised system requirement is that every Employee should have a 1:* relationship with EmployeePicture, so the developer creates BLL::UpdateEmployeePicture along with DAL::UpdateEmployeePicture.  Problem solved: a consumer can create Employees and EmployeePictures interchangeably.  Another revision requires that every Employee must have a default picture.  So, in the name of reuse and good OOP principles, the developer has DAL::UpdateEmployee consume BLL::UpdateEmployeePicture to create the initial record.  This is where we start getting into all kinds of trouble: the tier communication rule is broken.  While this is innocuous enough right now, it becomes an issue when the next design change requires the Employee record to be stamped with today's modified date each time an EmployeePicture is created.  So what do you do?  You can’t have BLL::UpdateEmployeePicture call DAL::UpdateEmployee, since DAL::UpdateEmployee could very likely be the parent origin of the current call.  You can’t have DAL::UpdateEmployeePicture call DAL::UpdateEmployee for the same reason; either path risks an endless loop.  The model is broken.  The ideal approach here is to have the DAL functions do nothing more than they say they do and have the BLL services be aggregate calls to the DAL, as sketched below.
All synchronous consumption flows inward and is never intermixed across tiers.
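
Here is a minimal sketch of that fix, with hypothetical entity and member names (ModifiedDate, the default-picture flag, and so on).  The DAL methods do exactly what their names say and nothing more, while the BLL aggregates them, so every synchronous call flows inward from BLL to DAL and never back out.

using System;

public class Employee
{
    public int Id;
    public DateTime ModifiedDate;
}

public class EmployeePicture
{
    public int EmployeeId;
    public byte[] ImageBytes;
}

public class EmployeeDal
{
    // Writes the Employee record and nothing else.
    public void UpdateEmployee(Employee employee) { /* dataset write */ }

    // Writes the EmployeePicture record and nothing else.
    public void UpdateEmployeePicture(EmployeePicture picture) { /* dataset write */ }
}

public class EmployeeBll
{
    private readonly EmployeeDal _dal = new EmployeeDal();

    // Aggregate call: saving a picture also stamps the Employee's modified
    // date, without the DAL ever calling back up into the BLL.
    public void UpdateEmployeePicture(Employee employee, EmployeePicture picture)
    {
        _dal.UpdateEmployeePicture(picture);
        employee.ModifiedDate = DateTime.Today;
        _dal.UpdateEmployee(employee);
    }

    // Aggregate call: the default picture for a new Employee is created
    // here, not from inside DAL::UpdateEmployee.
    public void UpdateEmployee(Employee employee, bool createDefaultPicture)
    {
        _dal.UpdateEmployee(employee);
        if (createDefaultPicture)
        {
            _dal.UpdateEmployeePicture(new EmployeePicture { EmployeeId = employee.Id });
        }
    }
}

If a deeper aggregation is needed later, it becomes another BLL service composing more DAL calls; the DAL never has to know.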

Services Are Autonomous
I like to read this one as ‘Do what you say and say what you do, completely.’  A service should perform the task it is designed to perform and never more or less.  The entirety of the process should be documented so that all service consumers are acutely aware of the process intent.  These services should be completely independent of the consuming code as well as of any services with which they interact internally.  This allows a service to be versioned and deployed independently of interacting processes.  It also creates a rule to follow in regard to this principle: once an interface is deployed and live consumption is possible, the interface can’t be modified.  Interfaces can be added and logic abstracted at this layer, but an interface, once published, is in fact eternal.  You should also always assume a worst-case scenario from the consumer of the service.  This means: don’t count on the data being valid; don’t assume all values are correct; and don’t assume that the developer of the consuming process has any idea what your service is expecting.  That includes depending on functions being called in a certain order to create and maintain scope-level state.  All things that need to be known by a service should either be inherently internal or provided by the consumer.  Create your rules, enforce them internally and raise exceptions when they are not met.  The same goes for interaction with other services that may not be available.  Never assume that everything you interact with will be available; if the service depends on something that is currently unavailable, you should have a fail-safe in place.
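
A short, hypothetical sketch of what the ‘assume the worst’ rule looks like in practice.  The names here (PayrollService, IBankingGateway and so on) are mine, not a prescribed pattern: every input is validated internally, nothing depends on call order, and a downstream outage is handled with a fail-safe rather than an unhandled crash.

using System;

public class PayRequest
{
    public int EmployeeId;
    public decimal Amount;
}

public class PayResponse
{
    public bool Succeeded;
    public string FailureReason;
}

// Hypothetical downstream dependency that may or may not be reachable.
public interface IBankingGateway
{
    void Transfer(int employeeId, decimal amount);
}

public class PayrollService
{
    private readonly IBankingGateway _banking;

    public PayrollService(IBankingGateway banking)
    {
        _banking = banking;
    }

    public PayResponse IssuePayment(PayRequest request)
    {
        // Never trust the consumer: enforce the rules internally and raise
        // exceptions when they are not met.
        if (request == null) throw new ArgumentNullException("request");
        if (request.EmployeeId <= 0) throw new ArgumentException("EmployeeId must be positive.");
        if (request.Amount <= 0m) throw new ArgumentException("Amount must be positive.");

        try
        {
            _banking.Transfer(request.EmployeeId, request.Amount);
            return new PayResponse { Succeeded = true };
        }
        catch (TimeoutException)
        {
            // Fail-safe: the dependency is unavailable, so report the failure
            // cleanly instead of leaving the consumer guessing.
            return new PayResponse { Succeeded = false, FailureReason = "Banking gateway unavailable." };
        }
    }
}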
 
Services Share Schema and Contract, Not Class
Service consumption should be based on contracts, schemas or policies.  A good way to stay agnostic in this regard is to communicate using SOAP or XML schema-based messages.  This makes both language and platform a non-factor in your service consumption.  While the payoff depends on how much you need to be available outside of known technologies and platforms, it is always a good mental exercise at design time.  A contract must remain stable over time; again, once deployed, it must be eternal.  That is probably the most difficult part of this rule.  But I always like to think of a very simple example for this sort of philosophy: consider a light bulb socket.  Light bulbs themselves have changed over time, as have wiring standards, breakers, building codes, etc.  Yet a light bulb socket has maintained interoperability with all of these items by simply interacting with electricity on one side and exposing a static interface to a light bulb on the other.  It has no visibility into, nor concern with, the inner workings of the light bulb, nor does it care about the circuitry of the room in which it is housed.  Its interface, or contract if you will, stays constant and simply works with any input/output that interacts with the known interface.  Always maintain a clean division between internal data and external data.
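
As a small illustration of sharing schema rather than class, consider the data contract below (the namespace URI and member names are made up).  Consumers never reference the assembly containing this class; they build their own types from the published schema.  The IExtensibleDataObject hook is a WCF facility that lets an older consumer round-trip elements added by a newer version of the schema, which helps a published contract stay ‘eternal’.

using System.Runtime.Serialization;

// The wire shape (the schema) is the contract; this class itself is never
// handed to consumers.
[DataContract(Namespace = "http://schemas.example.com/employee/2012/04")]
public class EmployeeMessage : IExtensibleDataObject
{
    [DataMember(Order = 1)]
    public int EmployeeId { get; set; }

    [DataMember(Order = 2)]
    public string DisplayName { get; set; }

    // Preserves any elements a future schema version adds, so old and new
    // consumers can keep exchanging the same messages.
    public ExtensionDataObject ExtensionData { get; set; }
}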

Service Compatibility Is Based Upon Policy
Honestly, this one is tricky.  I have researched it, and the best personal example I can give is how you would use a WCF OData service and a QueryInterceptor to enforce a local security policy.  The intent is obviously to have external, possibly machine-readable, policies governing access and compatibility for certain configurations and aspects of your services.  I struggle with finding real-world examples, but I like using the QueryInterceptor as a simple illustration.  With it, you can check the context of the caller and decide whether or not a user in that context should be able to access the query that has been requested.  This seems highly scalable and very powerful, since it completely decouples the service's implementation from the policy governing its access.  I think this principle will become more useful over time as more and more services adhere to the first three tenets discussed in this article.
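
For reference, a QueryInterceptor along those lines might look like the sketch below.  EmployeeDataService, EmployeeEntities (a stand-in for what would normally be an Entity Framework context) and the department lookup are all hypothetical, and the snippet assumes ASP.NET hosting so that HttpContext is available; only the shape of the interceptor itself comes from WCF Data Services.

using System;
using System.Data.Services;
using System.Linq;
using System.Linq.Expressions;
using System.Web;

public class Employee
{
    public int Id { get; set; }
    public string Department { get; set; }
}

// Stand-in for an EF-style context exposing the Employees entity set.
public class EmployeeEntities
{
    public IQueryable<Employee> Employees { get; set; }
}

public class EmployeeDataService : DataService<EmployeeEntities>
{
    public static void InitializeService(DataServiceConfiguration config)
    {
        // Expose the entity set read-only; the policy below narrows it further.
        config.SetEntitySetAccessRule("Employees", EntitySetRights.AllRead);
    }

    // The policy lives here, completely decoupled from the query logic: every
    // request against Employees is silently filtered to rows the caller's
    // department is allowed to see.
    [QueryInterceptor("Employees")]
    public Expression<Func<Employee, bool>> OnQueryEmployees()
    {
        string department = LookupDepartment(HttpContext.Current.User.Identity.Name);
        return e => e.Department == department;
    }

    private static string LookupDepartment(string userName)
    {
        // Hypothetical policy lookup (config, database, claims, etc.).
        return "Engineering";
    }
}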

Conclusion
Overall, SOA seems like a great target to have when architecting a system and should be kept as a standard at all levels.  It will position your system for future growth while preserving your sanity as you balance maintenance against advancing projects.