Searching documents with Oracle Text, part 1

As I have mentioned in passing, it is my firm belief that an essential part of GDPR compliance is documentation. If your documentation is sketchy, out-of-date, or vague (consisting, say, of emails and slideshows), how will you show that you have privacy by design, and that you enforce and verify this requirement at every stage in the development process?

Once you’ve created all these documents, how will you find what you’re looking for? Have you ever wasted time trying to find information among hundreds of files in a shared folder or Sharepoint? If so, you already know that these tools have limited ability to search inside of documents. Continue reading “Searching documents with Oracle Text, part 1”

Considering an end-to-end GDPR solution: Oracle

In this post I will discuss an Oracle presentation on how its product line provides the technical means for GDPR compliance (link). Although I am impressed with the presentation, my main reason for introducing it here is to be able to refer to it when discussing different stages of GDPR measures. Simply reading through this document will give you an idea of the range of technical measures necessary for good compliance at the database level.

In particular, I like the 3-page Appendix at the end of the document, which lists various GDPR articles and the Oracle feature that helps you to comply with each one. If you’re considering a packaged solution, your vendor should be able to present a similar mapping of GDPR requirements to product features. Continue reading “Considering an end-to-end GDPR solution: Oracle”

Sensitive data combinations

This post is my first attempt to tackle the thorny issue of data which is not core personally-identifiable information (PII) but which, in some combinations, is enough to identify an individual. I’ll call this type of data combination-PII (or combo-PII), and such a combination in a specific search a ‘profile’, for this purpose of this discussion.

Combo-PII is reference data that describes living persons

This type of data is usually called ‘reference’ data by database specialists. This is the background data that structures our picture of a person using categories, such as the city and country we live in, our age range, consumer choices (e.g., electricity provider), and similar data. Each of these values, taken by itself, is not enough to identify a person. Many such values taken together can, in some cases, either identify the data subject with certainty, or narrow the number of possibilities enough for subject to be guessed, or combined with other data to produce a match. Continue reading “Sensitive data combinations”

The Data Inventory, Part 1

Let’s get started on something concrete. One of the first things you’ll need to launch your privacy-compliance effort is an inventory of what data you are currently storing. This inventory will be at the core of your efforts, and will be the reference point for stakeholders. In this article I suggest a basic approach to get started using a single table. Future posts will add more tables to provide additional information, so that in the end we have a small schema for our inventory.  Continue reading “The Data Inventory, Part 1”