söndag 16 november 2014

Is REST the best thing there is?

This blog post is about what's important when creating an API. There are a few things which is sane, but there's a diagram which is, to be kind, misleading.
SOAP RPC REST
Requires a SOAP library on the end of the client
(1)
Tightly coupled
(2)
No library support needed, typically used over HTTP
(3)
Not strongly supported by all languages
(4)
Can return back any format although usually tightly coupled to the RPC type (i.e. JSON-RPC)
(5)
Returns data without exposing methods
(6)
Exposes operations/method calls
(7)
Requires user to know procedure names
(8)
Supports any content-type (XML and JSON used primarily)
(9)
Larger packets of data, XML format required
(10)
Specific parameters and order
(11)
Single resource for multiple actions
(12)
All calls sent through POST
(13)
Requires a separate URI/resource for each action/method
(14)
Typically uses explicit HTTP action Verbs (CRUD)
(15)
Can be stateless or stateful
(16)
Typically utilizes just GET/POST
(17)
Documentation can be supplemented with hypermedia
(18)
WSDL - Web Service Definitions
(19)
Requires extensive documentation
(20)
Stateless
(21)
Most difficult for developers to use.
(22)
Stateless
(23)
More difficult for developers to use
(24)
Easy for developers to get started
(25)
As for item (1) this is entirely true, but the item (3), is just blatantly false, we tend to keep forgetting that you need a library to use http.
Item (2) is not true, you are tightly coupled with the implementation of the RPC library (which you are in both SOAP and HTTP/REST) so whats the difference?
(4) is valid however a weak point. SOAP is basically XML+XML parser+XML validator and http client.
(5) is weird, what data isn't coupled to it self?
(6) This is myth with REST about exposing methods. So it matters what methods I use to make a call? Consider following code
public class MyResource {
    public String GET(String parameter){
          // return something
    }
    public void POST(Map <String,String> data){
      // do something with data
    }
    public void DELETE(String id){
       // delete id
    }
    public void PUT(String id) {
       // update id
    }
    // The rest here... All pun intended :)
}
So whats the difference here? Does a method name matter that much? You still need the URL just as you would anything else. This also applies to items (7) and (8).
(9) I must admit is convenient with REST, however still achievable with other means but a lot uglier so this I guess is a valid point. However in a real API this is a moot point.
(10) is true but in certain environments this is actually more loser coupling than a REST call, since you don't have to know content of the message but only that its a SOAP envelope. The envelope creates a loose coupling for some broker system since they doesn't need to know the type of message or the endpoint. Another important things is that if you use HATEOAS could actually result in more data sent because you have to manage the resource from the client, whereas if you use a SOAP this is managed by the server.
(11) if this a problem, then I guess you have larger problems.
(12) This just doesn't matter. In any large application this will be loads of URLs and confusing by its own.
(13) True, but a technical detail.
(14) Same as (12), just doesn't matter.
(15) If you build services as CRUDs then you are creating unmaintainable services/APIs.
(16) True but moot point.
(17) Again moot point.
(18) Again, there's no CRUDs. Code which based on this is automatically unmaintainable.
(19) True, but in there's WADL for Rest too. This is a nice feature which RPC usually lacks, however using a interface for RPC is not bad either.
(20) Wrong. You need as much as you need for a REST service. If you don't then you don't know how to code. Sorry partner. Just because you use REST doesn't remove the need of context of what the resource do, supposed to be and what relations it has. If you build you application on this premise it will be buggy and error prone with time. It will also add unnecessary costs to maintain it.
(21) This is just irrelevant point.
(22) Well if you have good tooling, I'd say SOAP is the easiest, though I can agree it could get unnecessary complex. I also think that SOAP is trying to do too much which is not needed.
(23) Again irrelevant point.
(24) Again with good tooling this could be as hard/easy as using SOAP.
(25) This depends on the RPC implementation. However its easy to hide it with abstractions.

A protocol is NOT an API. REST is a deceptive technique. If you have a web application and you want some sort of database without anything in between REST could be a choice. However if you want to do something else, use other everything else but REST. REST doesn't create an abstraction as it's advertised to do because the mechanics of retrieving and getting data is now on both sides of the protocol boundary. This creates a mechanic coupling, stuff knows too much about other stuff. I bet that we'll se a lot of "legacy" problems in the future where rest apps are becoming a real problem. There is a lot of problems with the REST proclamation and of its benefits. My view on that is that REST is the answer on another problem which is born out of inherit problems with the inability to code correctly (and that differs on programming paradigm, language and framework).

My point is that, it cannot and should not matter how you retrieve data, just that you actually did and what that meant for that particular code. If the protocol implies mechanics on meaning you are mixing things which should not be mixed.

fredag 14 november 2014

So what is "decoupling"?

One favorite argument when arguing for new technology or when people is arguing over what design principles you should use is the term "decoupling". My personal belief about this word is that it's extremely abused and more than often just used to boost the credibility of your own idea. It's also used as something to give the technology some "magic" properties as in "if we use this everything will be a lot easier". This is especially true for the arguments for using technologies like REST or Microservices. These two definitely have their uses but too often they are used as some sort of silver bullet which solves everything you throw at them. And with these two technologies are very often accompanied with the term "decoupling" or more correctly loose coupling. Well how intriguing, so how do they do this?
Consider following code
public void process(String[] array){
   for(int i = 0 ; i < 3; i++) {
       // Do something with array[i]
   }
}

So we loop through an array which is 3 elements long. So what happens if we do
public void process(String[] array){
   for(int i = 0 ; i < 5; i++) {
       // Do something with array[i]
   }
}

We have now changed the loop to 5 iterations. We haven't changed the method's signature so we don't need to change the type or anything, in a REST service the URL haven't changed. It still looks like the method as in the previous example. However it will fail if the client only has provided an array which only contains three elements. So how does this relate with REST or Microservices? Well if this would be a REST method, the technique REST wouldn't do shit about it. It would fail miserably like any other protocol able to invoke a method. This change will require a change with the client, so its not particularly "decoupled". The same would be true for a Microservice, all of its clients has to change. So where's the decoupled stuff with it? In REST is it the idea of you could add parameters without changing an API with a JSON map? Whats the point of that, send in data the service doesn't use or receive data which is not used? Have a endpoint which swallows data which you then try to make sense of it? Yay now you just created a ball of mud service by bypassing all of the abstraction mechanics you have available because you are lazy and being unprofessional.

With a Microservice the argument is that just because you can deploy stuff separately will make them decoupled. How does that apply to the above change?

 There's no such thing as a free lunch when coding. The above code is extremely simple but the length of the array is extremely important for the overall functionality of that piece of code. Actually there loads of information you have to deal with just with dealing with a string array of 3 elements. Cutting corners here will create a code which is bound to be misunderstood (yes with or without test), its just a matter of time. The above code has no description or information on what type of strings which is allowed as parameters. String's per se is an abstraction of information. A standalone string doesn't mean anything, it has to be put in a context which is meaningful for that particular data type. Just because it's a string and it's "understood" by the java runtime as an typed object doesn't mean its something which is relevant.

In today's http world, where we don't have to deal with the underlaying binary behaviors because it's abstracted away from us by standards like UTF-8 and such its really easy to not understand that most of the things we are dealing with are just some sort of representation of something, but because we can read it, and interpret it (as humans) we forget that whenever we're sending stuff (changing context) the information has to checked again to actually make sure that the context hasn't changed. This is what we see in the above example. The context has changed. It's still the same type of data but the requirements has changed.

Yes the example is a simple one, but imagine when a system contains millions lines of code, and try to follow data through a system where the code is not describing it's expectations of data makes it extremely hard to reason about the outcome of that code. So think twice when using your damn maps, and just because you can give abstract properties to the relations of an object doesn't mean that will create better code. In fact it will create a lot worse code since the intentions of the data is not described. It's like you would convert everything into an the type object and typecast it just whenever you want the information.

måndag 10 november 2014

The meaning of pragmatic

On the net there are an abundance of advice how you should be doing this and that to actually be successful in delivering projects. You should be agile and develop agile systems but no idea on what agile code is (no it's not writing tests all over the place). You should adopt this or that work flow/process/mantra/technique/<insert the buzzword of the month here> and you should use <insert the programming language of the month here> and by using <insert the flavor of architecture of the month here>. Oh shut up and pick a language and use that. Swapping between languages for solving things is not a good thing. You don't see a lot of professors in several languages for a reason, its just too hard (of course there are a few which is good at it, but thats a few.). Problem is that it's very hard to define someone as really good at a language, how do you measure it?

And also that everyone is pragmatic by doing simple things in a simple way.
WTF...? If coding is simple and solving problems is simple why do we have a job and is a problem a problem if its simple? No wonder that things fails and code looks like shit. A newsflash for most people, if you code is simple like a CRUD, you just failed programming 101. There's no such thing as a free lunch, particular when writing code. There are frameworks which will make you life a bit easier but, most of the time they fall short. Also if you are coding you ALWAYS have to look at the big picture (yes TDD falls short here). If you don't you will end up with a piece of unmaintainable piece of crap code because it's content is just rubbish. It may look all "clean codeish" and dandy with tests and stuff but it will still be a piece of crap, and bug ridden.

Oh hell, I just wrote my own "advice" (read rant) on the net... So I'll give my opinion on what pragmatic means. Pragmatic is NOT: use small frameworks and simple code.
Pragmatic is: use frameworks that are known and write code which makes problem solving simple.

söndag 2 november 2014

EA is troublesome

This blog post is about 6 heresies you could do with EA. And in my opinion, this is why EA fails because its too slow to actually do something useful. In any system which EA is needed the details are hard to capture with EA and its tools are too abstract. The post also says that EA practitioners usually fall in love with their framework so much that it becomes framework produced model which is not modeling the system its supposed to model and they should develop something that "works" for them.

It would be nice to have some framework to reason about the system, problem is that how the functionality of the system is in the details of where the code is executed. Usually the models of a system is bound to the physical structure of the system and also that a process forms a module. This is just not simply true, but to know that you have to dig into the code where there are several details which makes out the functionality. There are things like language constructs, code constructs and lack of information which is crucial for the model, but that is only found in how the code is built. And unless we are able to figure that out each time we take a "snapshot" of the system, EA and the tooling will fall short.

For EA to work it has to be a lot more code centric, and it's tools needs to be able to understand code and what its built for, and unless that is not figured out EA will not to able to provide information which could be based on making decisions. I'd say because of this EA is responsible to actually make systems worse because you are not able to make decisions which is reflected by the code but the process instances, which is not good.

There are several other issues with EA related to the inflation of patterns, but thats another post.

Edit: A good view is this video. Simon Brown has a good things going on there.


tisdag 14 oktober 2014

Are you solving problems by solving it with new technology?

Talked recently with a old colleague of mine about the solutions we are working with. We both worked with legacy and green field projects together and now he's tearing several old Tomcat systems apart and creating smaller (but not Microservices) services. He's using Dropwizard and this is a particular cool platform and I was eager to hear about it. My current assignment is working with a couple decade old JEE applications which has been somewhat ported to JBoss (which for the record, has not been impressive as a platform). But what we had in common on our new jobs were the fact that we had to take care of loads of legacy code. Both systems are lifting a heavy load and serving millions of transactions each day, and one system is from a successful startup.

There were two conclusions we made:

The first is that it really doesn't matter what you try, your code will look like crap or at least the guy coming after you will think of it does. Even the best intentions of keeping it nice and neat, if not given time or its large enough, it will deteriorate into a ball of mud. Both applications has been "customer" driven and whenever something major has changed, the developers hasn't been given the time to do a proper rewrite. For different reasons even the rewrites is not optimal and probably is the result of the second conclusion.

The second conclusion is that, for some reason people tend to solve problems by throwing technologies at it, like a NoSQL database or some fancy framework instead of just using plain old code. Whenever they are trying to solve stuff with the "new cool tool" they might solve the immediate problem, but loads of others problems turns up. Part of it that the developers doesn't understand how to use it or its just too damn hard using it, or its not working as they thought it would be. Or they end up with different frameworks entangled, both not fully understood or adapted.

I think we all have been guilty of being the one throwing "that new cool tech" or pattern at a problem instead of actually solving it with the current choice of technology. Every time someone starts blabbering about patterns, and particular when I know they even haven't understood the implications or details of that "pattern", I can almost feel the deterioration of that particular code. Or when you incorporate that, on the paper, beautiful BPM or <insert architect's new favorite tech solution which he have studied but never used> which you really don't need and poorly understood, because the only thing you need is code. And this is also one key factors which adds to crap code, is the idea of buying some other thing which should be able to "fix" the mess your code is, instead of just writing proper code.

On the other hand, as a developer you automatically assume that the guy which touched this before you is a dumb redneck which fell asleep on the keyboard which he have happened to pass on his way to the toilet. As the Dilbert strip so eloquently pictures it, do we as a professional admit the last guy actually did a good job? I know I've been guilty of thinking in those terms, but lets face it, most people does these things to put food on their table, not building the most beautiful piece of code ever existed.

I think most of the time you just need to sit down and think hard on what you could do with your current solution can do for solving it and not being seduced into using some technology.

onsdag 17 september 2014

Why you SHOULD use layered architecture

This post argues against using layered architecture since according to the post you simply don't change implementation of your database or implementation of JPA.

Firstly, the term layering is a misused term, layered architecture should be regarded as layered on its own. There are several different layers, such as communication, framework, hardware each of these might contain several other layers of multiple layers of some notion, such as a OS hosting processes etc.

Now secondly, if layers matters, you are doing layers wrong (layering to hide implementation of frameworks in a application which is about e-commerce is silly). Layering is organizing context, a common misunderstanding is that layering is abstractions. It has a lot to do with abstractions and most of the times its get confused with it, but it really is about context relaying.

Context relaying is heavily bound to the fact that abstractions leak. Some sort of combined function will always leak the inherent functionality. If it were true that the function wouldn't leak, it has to be so general that it will accept anything and reply with precisely what you want, i.e the God Service.

Now computers are inherently stupid, they will do as exactly as told even if its wrong, as long you talk its language. Problem is that whenever you are producing code you mostly focus on the how not the what. As an example is the inherently problematic behavior when writing objects. You might name an object as Animal and when you read this it seems like a very sensible thing to name your object because it might have a place in the object community. However this is blatantly wrong because the notion of Animal doesn't mean anything unless you provide proper context which is described with code, either directly or indirectly. Try switch the name of that object to X4 and things starts to make less sense. In fact, try obfuscating the whole codebase and try to understand what is going on, and if you do you might be closer to a good layering structure.

Now a layer is where a bunch of objects are making sense regardless of other bunches of objects are doing, even if there seems to be some sort of dependency of them. The code has to be reflecting this, if more than one "bunch" of code is breaking layers you have done layering wrong. Object oriented programming and TDD is prone to produce bad layering because of the fact that you are focusing on the "unit". Both SOLID and the principle behind TDD encourages doing layering correctly, combine them with context relaying you are one step closer to doing layering correctly.

Another factor of the "mess of layers" is the misuse of interfaces. And this is inherited by the fact that  its simple to produce code interfaces but hard to create data interfaces and those are mostly overlooked. Such as when creating simple CRUDs.

The idea of managing code with organizational structures

Recently Mckinsey published an article about how to achieving success in large complex software projects and their conclusion is cross-functional team produce better results. They mention an example of an insurer which had problems and switched to cross-functional teams. This let the individual programmers started to "focus on delivering end-to-end functionality rather than just thinking about their own roles" and because of this it communication improved, resolving requirements and solved problems faster. As a result code defects fell by 45% and 20% faster time to market.

The report mentions "agile, lean and test-driven" and shorter iterations but not exact methodology, but they do confine the development process into a team per module, whatever a module is defined as. Which should be aligned with having a cross-functional team.

However there are no, as usual, no mention of how the code were written, language used, just that you have to manage people and it will auto magically be a success. I think this mentality is really weird and typical for this kind of research. Also the example seems to be a new product which is defined from scratch, which means that most of the "bumps" is avoided (a legacy system is not documented and/or requires changes, rebuilding stuff because they didn't thought of your project when building things). As a conclusion, how you write the code really doesn't matter for your projects success as long they are producing functionality which the customer wants, and present belief, continuously asking the customer seems the way to go.

The claim that structuring your team which maps to the produced "modules" is interesting. Conway's law dictates that any system is doomed to be a direct copy of your organization. This has been dubbed to the Mirror thesis and the result is quite interesting. According to this, and somewhat Mckinsey, structure in the organisation should be reflected in the system. Two organizations which has taken this to an extreme is Avanza and Netflix. Avanza has apparently taken the scrum idea into the organization and utilizes it all the way up to the company board. Netflix tried to reverse Conway's law by implementing microservices and structure their organization from which architecture they wanted.

So if this is true having an agile organisation should produce an agile system. This should also be reflected in the code since the code outlines the system. So the thought here is that, in any given code base one should be able to get which type of organizational structure were used when building the software. Now if this is mirrored accordingly to which modules are produced which is probably fine, however if the modules starts to blend (more and more interdependencies between the modules, business grows, initial structure was off etc) with each other, or worse, the organization or business changes, due to external factors.

Now things tend to break apart, sometimes because its not feasible to change the codebase because theres no time or its not just understood. It might be so, combining information in new ways creates new interesting business areas which breaks old module "barriers". Management has no idea that a decision completely wrecks a structure, or system is simple not able to deal with the modular idea. Either way there's no way of escaping Conway's law, even with all the good intentions in the beginning.

But using an agile organization or development process (which is part of the organization) the code should be reflecting this somehow. The only evidence I can see is that TDD could produce "test induced design damage" which happens when testability becomes the main goal, and is not a good thing. The result will and always will be a result of your organizations communication "routes" since that what systems are all about, automatization of whatever case you have, and any deviation from that will cause an mismatch in your total organization and will create tearing problems.