My post for today takes the now-forgotten subject of software-process disciplines, in this case the Capability Maturity Model (CMM) (link), and proposes it as a sustainable, repeatable, and justifiable route to privacy compliance. The CMM’s method is a sharp contrast to the most common current methods, which generally come under the label of ‘Agile’ (link), a label that I have found, more often than not, serves as a nice-sounding synonym for ad-hoc development.
To simplify, the idea behind Agile is that, if you get all the stakeholders in development physically together, the creative dynamics of the group will produce synergies that result in good software. CMM holds that this approach is not as likely to produce good software as a process-laden, metrics-filled approach that insists on measuring things and gradually improving the team’s capabilities (the ‘C’ in CMM).
My purpose here is not to argue that CMM is better than Agile, or vice-versa; the best method for a particular job depends on all the circumstances: business, technical, and regulatory. Rather, it is to assert that some projects are better served by a planned and documented process than by an ad-hoc method. GDPR is one of the circumstances that favor a planned approach, for many reasons, such as:
- GDPR is a hard requirement. If it turns out during development that privacy is going to be difficult to implement, you cannot meet with the business and get the requirements changed (as envisaged by Agile)
- GDPR mandates “privacy by design”, which implies that you start not only with a design, but a design that shows how you will ensure data privacy. Once you have this design, the most straightforward way to implementation is to follow the design
- One may be tempted to read privacy-by-design as applying only to the design stage; I will argue that it means that the entire development process must proceed with privacy as an inflexible, non-negotiable requirement
- If we claim to implement privacy by design, we must presumably be able to demonstrate that this is what we are, in fact, doing. In other words, we need to justify our process to the authorities and to data subjects. We need a compliance history (link)
- Finally, as I will argue in future posts, escalating security and data breaches have underscored the high risks associated with software development, especially where personal data is to be processed
My purpose here is to argue that, just as we have development methods for software, we need methodical approaches to data privacy and security. I propose following the maturity approach of the CMM for personal data, a kind of privacy maturity model.
The risks are higher now
As data controllers and processors, we care (or at least should) about the privacy of data subjects, but we also need to protect ourselves. We may be doing everything the right way, but if we can’t demonstrate this fact, then we cannot convince others (especially after a breach or other problem).
The modern internet landscape is a moving target, with a constant, dizzying flow of new protocols, products, strategies, and fads. Every change creates new opportunities for attacks (or plain mistakes).
Anyone, no matter how careful, can fall victim to a serious problem; this uncertainty reflects risks, both known and unknown, that are beyond our control. Not only is the likelihood of these risks unknown, but so is their impact if they do occur.
What is within our control, however, is the process we choose to promote privacy by design, and the documentation we keep to show that we are doing what we claim to be doing. By following such a process, we assemble a track record of best practices that will be difficult to fault should there be an audit or a breach. We might even find that we have fewer garden-variety (that is, non-GDPR) problems with our projects.
Stage 1: Requirements
It is not very common, in my experience, to come across a project that has reasonably complete, up-to-date, written requirements. More often there is nothing more than some boxes and arrows in a slideshow. There are many possible reasons for this:
- it is easy (and tempting) to assume that everyone in the team already understands the business well enough to dispense with explicit requirements
- composing requirements is a tiresome, exacting endeavor, requiring lots of meetings to elicit buy-in from all the stakeholders (nobody has time to come to your meeting, and anyway all the meeting rooms are booked up for weeks)
- once you get the needed business experts together and start the process of writing things down, you find that there is disagreement on exactly how the business operates, especially in borderline or exceptional cases
- if your project is meant to replace a legacy application, it might seem sufficient just to tell the developers, “look at the old app and do everything that it does, but without the bugs and adding features x, y, and z”. This approach is not likely to help when it comes to privacy
In this last case, however, one might be led to wonder why the old app that is being replaced cannot simply be modified. Is there no documentation about its design, its coding standard, its test cases? There may be a good reason (such as a vendor going out of business), but more often the system has been patched so much that everyone’s afraid to change anything.
A production application represents a major expense for an organization. An application that is abandoned merely due to the difficulty of maintaining it is a huge waste, and an indictment of the process used to create it. It is this all-too-common waste that the CMM seeks, among other things, to avoid.
The conceptual model
Once your requirements begin to take shape, you can start to make a model of what business ‘things’ are to be processed.
However you approach it, the point is to figure out what data you need to process and, if it’s personal data, why you need it and how you justify that need. Whatever personal data you can eliminate, consolidate, or justify at this stage will save a lot of time in the downstream stages.
Your conceptual model should give you:
- the data you’ll be processing, specifically the personal data, for your data inventory (link)
- your business reason(s) for processing this data, plus the legal justification, retention time, and so forth
- a first indication that consent is needed from the data subject (for example, if requirements include profiling data subjects for cross-selling)
- a tool for a common understanding and agreement among the business-expert stakeholders
- a tool for communicating this understanding to non-experts (modelers, architects, developers, data-protection officers) downstream
- evidence of your efforts if the authorities should ask for it
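As a rough illustration of the first three items in the list above, here is a minimal sketch of a data-inventory record in Python. Everything in it is an assumption made for the example: the class and field names, the sample entries, and the retention periods are hypothetical, and the legal-basis strings are only labels, not legal advice. The point is simply that an inventory kept as structured data can be checked for gaps.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class InventoryEntry:
    """One row of a data inventory (all field names are illustrative)."""
    data_item: str       # what is processed, e.g. "email address"
    is_personal: bool    # does it identify a data subject?
    purpose: str         # business reason for processing
    legal_basis: str     # label such as "contract" or "consent"
    retention_days: int  # how long the item may be kept

    @property
    def needs_consent(self) -> bool:
        # First indication that consent must be collected from the data subject.
        return self.is_personal and self.legal_basis == "consent"

    def delete_by(self, collected_on: date) -> date:
        # Latest date the item may be held, given when it was collected.
        return collected_on + timedelta(days=self.retention_days)


def unjustified(entries):
    """Return personal-data items that lack a purpose or a legal basis."""
    return [
        e.data_item
        for e in entries
        if e.is_personal and not (e.purpose.strip() and e.legal_basis.strip())
    ]


# Hypothetical entries, invented purely for this example.
inventory = [
    InventoryEntry("email address", True, "order confirmation", "contract", 730),
    InventoryEntry("purchase history", True, "cross-selling profile", "consent", 365),
    InventoryEntry("date of birth", True, "", "", 365),
    InventoryEntry("product SKU", False, "order fulfilment", "", 3650),
]

print(unjustified(inventory))
print([e.data_item for e in inventory if e.needs_consent])
```

Items reported by `unjustified` are the candidates to eliminate or justify before moving downstream; items flagged by `needs_consent` tell you early that a consent mechanism belongs in the requirements.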
Keep it simple, involve everyone
I find CRC cards (link) (proposed, as it happens, by some of the inventors of Agile) to be a good way to start. You have one card per entity, with basic information about each one. The cards are separate, and so can be edited during a meeting or brainstorming. The only information I would add to the classic CRC card’s class, responsibility, and collaboration ensemble is the private data (if any) to be handled by that class, along with the how and why of its processing.
At this point the cards are free-floating, so that no order or relations are implied by the placement of the business items in relation to each other; that emerges from the items’ collaboration (the final ‘C’ in CRC) with each other. The usual arrangement, with boxes and lines already drawn, implies that the decisions are already made, which may lead some people not to voice their reservations.
Finally, and possibly most important, the cards are ‘friendly’; they don’t look like a technical artifact, so that non-technical business stakeholders feel comfortable questioning and editing them. This is crucial, as the point of the exercise is to elicit the essential concepts of the function to be automated from people who have a collective, but not completely congruent, understanding of how the function in question works.
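Once a workshop is over, the paper cards can be transcribed so that the result is not lost. The sketch below is one possible way to do that in Python, assuming a hypothetical `CrcCard` structure with the extra personal-data field described above; the names and the sample cards are invented for illustration and are not part of any CRC standard. No relations are stored explicitly: they are derived from the collaborators listed on each card.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalData:
    """The privacy addition to the classic card (illustrative names)."""
    item: str  # which personal data the class handles
    how: str   # how it is processed
    why: str   # why the processing is needed


@dataclass
class CrcCard:
    class_name: str
    responsibilities: list[str] = field(default_factory=list)
    collaborators: list[str] = field(default_factory=list)
    personal_data: list[PersonalData] = field(default_factory=list)


def relations(cards):
    """Derive the relations between cards from their collaborators."""
    return sorted({tuple(sorted((c.class_name, other)))
                   for c in cards for other in c.collaborators})


def missing_cards(cards):
    """Collaborators that are named on a card but have no card of their own."""
    known = {c.class_name for c in cards}
    named = {other for c in cards for other in c.collaborators}
    return sorted(named - known)


def handles_personal_data(cards):
    """Names of the cards that record any personal data."""
    return [c.class_name for c in cards if c.personal_data]


# Hypothetical cards from an imagined order-handling workshop.
cards = [
    CrcCard(
        "Customer",
        responsibilities=["knows contact details"],
        collaborators=["Order"],
        personal_data=[PersonalData("email address", "stored", "order confirmation")],
    ),
    CrcCard(
        "Order",
        responsibilities=["tracks ordered items"],
        collaborators=["Customer", "Invoice"],
    ),
]

print(relations(cards))
print(missing_cards(cards))
print(handles_personal_data(cards))
```

A collaborator without a card, such as `Invoice` here, is a useful prompt for the next meeting: either someone writes the card, or the group agrees that the concept is out of scope.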
Well begun is half done
A good conceptual model can lead to a good set of requirements, an agreed-upon, written understanding of the business processes to be automated.
Visible to all stakeholders, requirements constitute a valuable business asset, eliminating untold hours of re-work, misunderstanding, and fruitless argument. (I have often seen requirements still being hashed out well into the coding stage.) They are also a valuable asset for on-boarding new team members, saving them days or weeks of trying to figure out what’s going on by asking others or by hunting for documents in project repositories.
Last but not least, when anyone inquires into your privacy by design, you’ll be able to demonstrate that it was a prominent concern from the drawing-board onward.