Friday 14 April 2017

I actively consider what I think ought to go into a controller

G'day:
This is one of these ones where I state the obvious (probably), and expose myself as a fraud in that it's taken me this long to reach this conclusion. I can only take solace in that I'm clearly in "good" company. Where "good" == "a lot".

Yesterday I was going a code review, and came across a controller method which made one of my eyebrows raise as it was about 40 lines long. Now this isn't intrinsically bad in context. I have seen (and still see!) controller methods that are hundreds and hundreds of lines long. In the past I've seen controllers tipping the 1000-line mark. Those examples are clearly just wrong. And the developers who wrote them and/or continue to maintain them should be shot. I'm not even sure I'm over-stating it with that sentence. Hmmm... I probably am I guess. But they should be squarely and actively admonished, anyhow.

So in the scheme of things, 40 lines ain't bad. Especially when we try to keep our code wrapped within 80-100chars, so a few of the statements were multi-line. There was about say a dozen statements in it altogether. Not so bad.

Before I go on - and not only because the dev concerned will be reading this article (once I send them the link... pretty sure they don't know this blog exists) - I hasten to add that this code is not by any means bad. I'm probably gonna go "close enough" on the code review, once one bit of the code is refactored at least into a private method. I am not having a go at the dev concerned here. They are just my muse for the day. And I happen to think they're pretty good at their job.

There was one bit of the controller that stood out to me and I went "nuh-uh, that's gotta go". There was a foreach loop just before the response was returned which performed a calculation. That's business logic and it has no place in a controller. Fortunately we have an appropriate service class which we can lift that out and re-home in, and slap some tests on it, and our code is a bit better and a bit more maintainable.

(NB: yeah... no tests... we have a policy of not testing controller methods, but that's with the caveat there's no logic in them. We followed the "don't test controller methods" bit, but not the "provided there's no logic in them" bit here).

However once I spot a problem, I immediately start looking for more, so I assessed the rest of the code there. The controller did seem too fat. I wanted to know if all the logic did actually belong in the controller, and my instinct was: probably not.

At the same time as writing this article I'm chatting on Slack about it, and me mate Mingo offered the perennial answer to the question "should this go in the controller?"

Isn't the answer always "No"?
~ mjhagen 2017

(yes, he cited the attribution too, so I'm including that ;-)

It's pleasingly pithy, and almost ubiquitously agreed with; but in a casually dismissive way that is seldom heeded, I think. The problem is it's alluding to a mindset one should adopt as a guideline, but clearly the answer isn't "no", so I think people dismiss it. If one says "one should never do x", and it's clear that "never" is slight hyperbole, people will acknowledge the hyperbole and stretch it back the other way as far as they see fit to accommodate their current whim.

With this in mind (retroactively... Mingo just mentioned that now, and the code review was yesterday, and this article was drafted in my head last night), I stopped to think about a rule of thumb as to when something ought to go in a controller.

Broadly speaking I feel there are only three types of logic that belong in a controller:

  1. a call to validate / sanitise all incoming arguments so they're fit for purpose for fulfilling the request;
  2. one (preferably) or more (and increasingly less preferably the more there are) calls to get data - using those incoming arguments - to build the response;
  3. a call to assemble that data for the response;
  4. returning the response.

Yeah that's 4. I was not counting the return statement. An example of this might be:

class UserController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        user = userService.get(args.id)
        publishedContent = contentService.getForId(args.id, args.startDate, args.endDate)
        twitterContent = twitterService.getForId(args.id, args.startDate, args.endDate)
        facebookContent = facebookService.getForId(args.id, args.startDate, args.endDate)
        
        response = {
            user = user,
            publishedContent = publishedContent,
            socialContent = {
                twitter = twitterContent,
                facebook = facebookContent
            }
        }
        return new Response(response)
    }
}

I think that's OK, even though there's four service calls. All the logic is busying itself with what it is to fulfil that request. The controller simply dictates what data needs returning, and gets that data. there is no business logic in there. It's all controller logic.

A pseudo-code example where we're adding business logic into this might be:

// ...
publishedContent = contentService.getForId(args.id, args.startDate, args.endDate)
topFiveArticles = publishedContent.slice(5)

// ...

response = {
    // ...
    articles = topFiveArticles,
    // ...
}

Here the slice call - even though it's a native array method - is still business logic. If the controller method was getTopFiveArticles or something, I'd go back to considering it controller logic. So it's a fine line.

Back to the code i was reviewing, it was doing 1&2, then it was going off-piste with that foreach loop, then doing 3&4. Ignoring the foreach, I was still itchy, but wasn't seeing why. It seemed like it was roughly analogous to my first example above.

So what's the problem? Why was the code still making me furrow my brow?

Well there was just too much code. That set-off a warning bell. I wanted to reduce the number of calls to various services (or repositories as is the case here: data from slightly different sources), but initially I couldn't really justify it. Then I looked closer.

The code in question was more like this (I've got rid of the date args to focus more):

class ContentController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        user = userService.get(args.id)
        publishedContent = contentService.getForId(args.id)
        
        twitterContent = twitterService.getForId(user.twitterId)
        facebookContent = facebookService.getForId(user.facebookId)
        
        response = {
            articles = publishedContent,
            socialContent = {
                twitter = twitterContent,
                facebook = facebookContent
            }
        }
        return new Response(response)
    }
}


I hasten to add the code I was reviewing is nothing to do with users and content and social media stuff, this is just an example which is analogous but would make more sense to people not in the context of our business.

The difference is subtle. The difference is that the decision that the IDs for the social content are stored in the user object is business logic. It's not controller logic to "know" that's how that data is stored. the user object here is not part of the data that composes the response; it's part of some business logic that dictates how to get the data.

And this does indeed break that second rule:

  • a call to validate / sanitise all incoming arguments so they're fit for purpose for fulfilling the request;
  • [...] calls to get data - using those incoming arguments - to build the response;

The call to get the social media content is not using those arguments. It's using a value from a separate call:
user = userService.get(args.id)
// ...
twitterContent = twitterService.getForId(user.twitterId)
facebookContent = facebookService.getForId(user.facebookId)


Another tell tale here is that user is not even part of the response. It's just used to get those "foreign keys".

My rule of thumb in those four points above is still correct. But one needs to pay special attention to the second point. Any time one's inclined to fetch data "indirectly", the indirection is business logic. It might not be a loop or some conditional or switch block or something more obvious that manipulates the data before it's returned, but that indirection is still business logic. And does not belong a the controller.

So how should that controller look? Move the content acquisition out into a service that has the explicit purpose of getting a user's content - from wherever - via some key:

class ContentController {
    function handleGet(rawArgs){
        args = validationService.validate(rawArgs)
        
        content = contentService.getById(args.id)
        
        response = {
            articles = content.published,
            socialContent = {
                twitter = content.twitter,
                facebook = content.facebook
            }
        }
        return new Response(response)
    }
}


This follows the second point to the letter, and there's now no business logic in the controller. Also the controller legitimately does not really need unit testing now, but the logic in the ContentService can be. Also the controller method is down to a size that would not raise any eyebrows.

This is not an earth-shattering conclusion by any measure, but I'm quite pleased to fine-tune my position here, and hopefully it can help drive discussion about this in code reviews in future.

It'd be great to hear anyone else's thoughts on this?

--
Adam