One of the things I find myself repeatedly doing is helping teams to structure and organise their knowledge content. And, as part of that, I often have to explain the approach and the reasons for it. So here it is.
My approach to structuring a knowledge base has two modes that interact with each other:
- Model-driven
- Data-driven
We usually start with model-driven, or even meta-model-driven. The meta-model, or template, aims to structure the content first by subject and type, and then by other use cases. Some will object to use cases being put third, not first. However, all three facets make the podium, versus lots of other also-ran considerations, so none really loses out. Also, developing a knowledge base is rather different to answering a specific need: knowledge isn’t for anything in particular but has value in many situations, so it needs to be flexible, which means being structured to more generalised principles that are adaptable.
So the meta-model tells me to get a breakdown of the subject matter as the people in the know see it, and also a breakdown of the different kinds of knowledge content they deal with. Both may come out together when they explain it, since people often find it hard to distinguish the two until it is pointed out to them. You have to listen carefully for the kinds of knowledge, since sometimes these are roles that different elements of knowledge play, e.g. problem, solution, approach, etc. Then the use cases are the answer to “how else do you need to be able to find or view this content?”. For instance, do they have tasks that require finding content by geography or period? Whatever they tell us indicates which additional tagging, structuring or description the content artefacts need beyond subject and type.
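The three facets described above can be sketched as a minimal tagging schema. This is an illustrative sketch only, not a prescribed data model; the class and field names (`KnowledgeItem`, `subject`, `kind`, `facets`) and the example values are all assumptions made for the purpose of the example.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeItem:
    """One content artefact, carrying the three facets of the meta-model."""
    title: str
    subject: str   # primary facet: the subject-matter breakdown
    kind: str      # secondary facet: the kind of knowledge (problem, solution, approach...)
    # tertiary facet: open use-case tags, e.g. geography or period
    facets: dict = field(default_factory=dict)

# A hypothetical item tagged along all three facets
item = KnowledgeItem(
    title="Regional onboarding checklist",
    subject="HR/Onboarding",
    kind="approach",
    facets={"geography": "EMEA", "period": "2023"},
)
```

Keeping the use-case facets as an open dictionary, rather than fixed fields, matches the “open” approach described later: users can add tags before the model is locked down.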
In parallel we can do a quick bit of data-driven work by looking at what their knowledge content looks like right now. Typically, in an unimproved situation, this will be a bit of a jumble. But looking at the data is a good habit and a good first step in all data work. Look at the data and you will see patterns, labels and structures, and you’ll be able to anticipate intent. For example, should you see (and this is a real-life example, utterly typical of many cases) a file structure in which different users have created sets of folders for years, geographical regions, campaigns, business units, products and so on – usually repeated in different areas of the overall hierarchy – then you know that these are ways they think about, access, store and want to work with the content. These should find their way into the model. From data, to model: a data-driven analysis gives us a cleaned-up model.
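The folder-name pattern described above is easy to surface mechanically. A minimal sketch, assuming a readable file share: walk the tree and count how often each folder name recurs; names repeated in different branches (years, regions, business units) are candidate facets for the model. The path in the comment is hypothetical.

```python
import os
from collections import Counter

def folder_name_counts(root: str) -> Counter:
    """Count folder names (case-folded) across an entire directory tree."""
    counts = Counter()
    for _dirpath, dirnames, _filenames in os.walk(root):
        counts.update(name.lower() for name in dirnames)
    return counts

# e.g. counts = folder_name_counts("/shares/knowledge")
# counts.most_common(20) would surface repeated labels such as
# "2022", "emea" or "campaigns" - hints at how people already
# think about and access the content.
```

A high count for a name that appears under many different parents is the strongest signal: it means several people independently reached for the same way of slicing the content.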
All of this should ideally be done quite quickly because, even though the meta-model and model are a great start, they can only take you so far – push them too far and you risk “analysis paralysis”. We should start working practically with real data by implementing the proto-model we’ve come up with in an open way that doesn’t restrict people but allows them to add to the structure and tagging as well.
So the next step is practical work, and that means observing and analysing how users interact (or don’t) with the model-based structure – analysing this data. The reason we want to get practical as soon as we can is that, so long as the design remains theoretical, it remains idealised. People report their idealised best selves and are poor at introspecting and predicting what they will usually do in real life. Clever people very quickly come up with all kinds of facets that the content should be classified by, because they are thinking ideally, or else in that frame of mind that finds it easy to make rules for others without seeing that the rules will apply to them as well (speed limits, anyone?). So it’s best to have a fairly ‘open’, practical trial.
As soon as we can see the practical behaviour, we’re gathering data about what really turns out to be useful, or, at least, used. People may use slightly different terminology to describe their work from what they reported before, may add or ignore categories, and so on. After a while we have some real-life data that we can use to drive a refinement of the model, and this is the point at which we might implement the vital few rules and restrictions that are really needed to hit the sweet spot: the best return for users on the effort involved in using the structure.
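The refinement step above can be sketched as a simple comparison between the facets the model proposed and the facets users actually applied during the open trial, keeping only the “vital few” that saw real use. The function name, the threshold and the tag data are all illustrative assumptions, not part of any real tool.

```python
from collections import Counter

def vital_facets(applied_tags, proposed_facets, min_uses=5):
    """Keep only the proposed facets that users actually applied
    at least `min_uses` times during the trial (threshold is arbitrary)."""
    usage = Counter(applied_tags)
    return {facet for facet in proposed_facets if usage[facet] >= min_uses}

# Hypothetical trial data: each entry is one real tagging action by a user
applied = ["geography"] * 12 + ["period"] * 7 + ["campaign"] * 2
proposed = {"geography", "period", "campaign", "sentiment"}

keep = vital_facets(applied, proposed)  # only "geography" and "period" survive
```

Facets nobody used (like the cleverly proposed but untouched `"sentiment"` here) are exactly the idealised classifications the previous paragraph warns about, and dropping them is what keeps the structure in the sweet spot of return on effort.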
At this point, via a winding path of (meta-)model-driven and data-driven design, we have probably reached the mythical 20% of effort that yields 80% of the benefit. It’s not perfect, but this is not a business of perfection; it is one of utility.