Welcome to a new episode of your favorite podcast. Today's topic is capturing and importing the vast amounts of existing content in your organization, and how to integrate it into your repository.

The import is one of the four major steps of an architecture tool implementation that we discussed in Episode 3.

  • Why bring in existing content?
  • Concerns: age of content, correctness, type of storage (project vs. Enterprise Baseline), integration into dev method/SDLC, document naming/versioning
  • Size of import: number of models and complexity drive effort
  • Content format: how to map the information to models, objects, attributes
  • Create catalog of to-be-imported content (> Notion)
  • Five phases of import: preparation, initial import, create new model, normalization, quality assurance
  • Preparation: catalog, database setup, filter/permission considerations, tool mapping (Visio), tracking and reporting
  • Initial import: detailed look at what to import; automatically or manually import, use catalog for tracking, issues when importing
  • Create new models: BPMN Collaboration example, merge between different databases
  • Normalization: where does content go into structure (subprocesses, catalog models, etc.), object consolidation, adapting models to “correct” layout/standards (such as object naming)
  • QA: modeler QA vs. central QA, semantic checks, macros, automated workflows; release and publish, communication

Additional information

  • Import process in an overview (graphic: “Existing content import – overview”)
  • Recommended DB setup (graphic: “Import database, filter, and permissions”)
  • Notion (notion.so) as catalog and tracker tool for the import – product will be available soon (graphic: “Content import – tracker in Notion”)

Credits

Music by Jeremy Voltz, www.jeremyvoltzmusic.com

  • CP1 (Welcome)
  • This (Interlude 1)
  • FC Groove (Interlude 2)
  • Cowbell (Interlude 3)
  • South Wing (Outro)

Transcript

(The transcript is auto-generated and was slightly edited for clarity)

Roland: Hey J-M, how are you doing today?

J-M: Not too bad, Roland, and having lots of fun! Isn’t that what we want to do these days? Have fun over the summer, right before we get back to work in the fall.

Roland: Well, I guess when we publish this episode, everybody is already back to work.

J-M: The old time delay! But the idea is we’re going to help inform, educate, and ultimately bring people together around topics we really care about – and today you get a spicy one. I know that everyone’s got tons of business collateral just sitting around, and you want to be able to use it in a harmonized, collected environment for all your stakeholders. How are you going to do that, Roland? You’ve got to bring it in.

Roland: Yeah, but in all seriousness – I think it was two episodes ago when we were talking about the implementation approach. That was full of stuff like defining your governance, implementing the technical aspects of the tool, and the activity that everybody forgets first: creating the strategy. The last point of that exercise was to bring in existing content, because you don’t want to start from scratch – and given the age of your organization, that might be decades of content in various formats that you want to bring into your architectural repository.

J-M: So this is our first point for today. Let’s talk about the theory of why you want to bring information in from existing sources. And Roland, you mentioned you might have decades’ worth of information. I like to think of it as kind of your storage room of stuff. You know, everyone is a hoarder, with rooms full of past projects and initiatives that had a lot of deliverables created around them, and there’s tons of value in that. But you have to be able to find it, bring it in, and then do something with it. So first and foremost: why are we even thinking about moving this content into our new repository? We’re building something new, we’re buying something new. Why bring our old crap in?

Roland: Well, I think the important thing here is that it’s a mix of how much time you have available to build up structures, and how much time, resources, and people you have available to bring in existing content. Because everybody is busy. So what you want is to make the effort for the individual as small as possible, so that they can focus on their current job while also supporting you in bringing in that existing content.

J-M: You don’t want them doing all this rework, particularly when you’ve already got it captured somewhere else. And you also want to bring in some of the insights from previous projects you’ve paid a lot for, particularly with other knowledge experts. I’ve seen a lot of organizations that work with these big SIs or strategic consulting firms – you know, the Big Four, anyone who’s coming in as a player to help advise the future of the organization. That’s a lot of money invested, and it doesn’t all have to go away. Sure, some of it might be out of date, but some of the best practices are still going to be relevant and important, and you want to capture and retain that value in the organization. Don’t let it walk out the door with the trash you’re throwing out anyway.

Roland: Yes, and on the other side you also want to have something like your Enterprise Baseline. You want to know what runs today, your as-is – because how else do you identify improvement opportunities? How do you identify situations where you want to streamline your application landscape, and all those other aspects that you might have listed as part of your objectives?

J-M: I want to give people an understanding of what to expect, though – this is really our first point for today. When you move content from a previous source to a new source, not everything is going to come over, and not everything is going to be valuable. There are a few concerns we want to look at when evaluating our population of existing information, and they will help you understand what you’re going to be able to retain and what the import process isn’t going to create for you brand new. The first thing I think about a lot is the age of content. I’ve worked with tons of clients who say “we’ve got this fantastic knowledge repository,” and I say OK, so let me literally take a look at the last-change dates – and they’re three, four, five years out of date. I can’t imagine the organization hasn’t evolved since then. So while there’s a lot of content, its age means that it isn’t in the context of the current operational structure, and man, that breaks my heart. Because if someone had done the knowledge retention and knowledge upkeep, it would be current and we could reuse it. So don’t expect that all of that old stuff is going to suddenly make new sense for you in the new environment, right?

Roland: Yeah, maybe they just didn’t have a place to store it, or it was stored as part of a project – and a project has a very specific purpose to fulfill. So the next thing you might want to look at is what is project-related content versus enterprise-applicable content.

J-M: Roland, I think I hear you say the words “deliverable syndrome” sometimes – information created to satisfy the requirements of a deliverable on a project. But how much context does that really have in the larger enterprise? What do you see in terms of the ability to retain it and get value out of it after the fact?

Roland: A lot of noise, a lot of noise is what I see here, and it’s mostly driven by the lack of a common approach that everybody agrees on: this is what we’re going to create as part of our implementations, and this is what we create to enable overall analysis outside of projects. So one thing that I would do, if I go back a couple of episodes: when you implement your tool, definitely define how all those artifacts will fit into the overall delivery process going forward – your SDLC, your Scaled Agile, and all these things that I mentioned earlier. When you think about the fact that the ultimate objective is to create a table of contents or a catalog of existing materials, you have to do an assessment of what goes into that catalog. And just like you said, when you research things, there might be siloed content that was just used for a single purpose. Or there might be a situation where somebody needed to create a slide for whatever important presentation they had to do. Or people copied a file over and over and over again, made little changes because they wanted to make a point, and then you end up with five or ten versions of it, and you’re left wondering which is the right one.

J-M: I remember seeing at one of my clients “document name Rev. A”, “document name Version 2”, “document name July 13th version”. It was a nightmare, and this repeated content means you’ve got essentially the same information duplicated over a ton of different sources. And when you silo that content, no one sees that anyone else has the same stuff. Then ultimately, when you try to import it, you’ve got a huge number of very slight duplicates. That’s going to be a hard consolidation and resolution problem to tackle, and you’ll need the right people to tackle it – particularly subject matter experts who can talk to that content and its applicability. And (I see this all the time and it breaks my heart): are these people still at the organization? Oftentimes no, so you’re guessing at what John Smith or Jane Doe meant when they put those words in that box on that date six years ago. And unless you’re still good friends with them on Facebook, you might not be able to get in touch and ask them a question. And jeez, if they’ve moved on, they’ve moved on with their lives! They don’t remember what they did six years ago on a project.

Roland: Yes, the people are literally the glue that you need to create this. So there might be holes in your catalog, which is fine, but at this point in the game you should have a good understanding of the volume you’re talking about. Have you discovered five processes, seven application landscapes, three data models? Or is it five thousand, seven thousand, and three thousand of each type? So I think the next exercise here is to scope the content. You’ve created that big pile of stuff; now you have to dive in, sift through it, and sort the content out. That also means you need to look at the context: what is the most important existing content to bring in, the content that aligns best with the objective of the whole exercise?

Roland: And then last but not least, you want to look at the size of the import. So you have your big pile, you’ve sorted it into smaller piles by importance, and now you look at the smaller piles and ask: what does it take to bring them in? One way I’ve seen this done – the example here was a bank that I worked with a couple of years ago – was to look at the complexity of the content. If we stick with processes, for example, what we did there was simply take arbitrary numbers and assign complexity to them. We said OK: if the whole model is less than 20 objects, it has low complexity. If it’s between 20 and 50, it’s medium. If it’s above 50, it’s high complexity. And we used that to determine the estimate: how long does it take? Maybe low complexity takes 2 hours, medium takes 5 hours, and high complexity might take a day for the initial import, depending on what you see there. That also meant we had to agree on what counts as an object. For example, in some of the drawing tools (no names shall be mentioned here) you just put in a shape and put information in there. What you need to know – and you’ve already done this as part of your technical governance – is the definition of the end artifact. Say, to stick with the process example, you want to bring in existing content as BPMN models. You might have information in your source files – annotations, remarks, maybe risks, or other content – that is not part of the BPMN specification. So you don’t have an object for that, and you need to define what to do with it. To go back to the bank example: they asked us for an estimate, we applied the low/medium/high complexity rule, which gave me a number, and then it took them six months to sign the contract. When we came back, they had created Word documents in addition to those existing models, which included way more information – some of them were between 6 and 20 pages. Which was obviously a lot more than we had estimated.

J-M: That’s a lot!

Roland: And we went back and said: look, we agreed on objects according to the BPMN spec, because that was the guideline we had agreed upon – and they were really not happy. But they also didn’t want to go back and say: yep, estimate the additional effort based on our 6-to-20-page documents to bring that information in.
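(A minimal Python sketch of the complexity-based sizing rule Roland describes – the thresholds and hours are the example values from the conversation; the catalog structure and model names are assumptions for illustration.)

    # Assign a complexity class and effort estimate to each catalog entry.
    def complexity(object_count: int) -> str:
        if object_count < 20:
            return "low"
        if object_count <= 50:
            return "medium"
        return "high"

    HOURS = {"low": 2, "medium": 5, "high": 8}  # "a day" counted as 8 hours

    catalog = [
        {"model": "Verify income", "objects": 14},
        {"model": "Open account", "objects": 37},
        {"model": "Loan origination", "objects": 112},
    ]

    for entry in catalog:
        entry["complexity"] = complexity(entry["objects"])
        entry["estimate_h"] = HOURS[entry["complexity"]]

    print(sum(e["estimate_h"] for e in catalog), "hours for the initial import")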

J-M: And I would say that there are some toolsets you might be importing into that are incredibly restrictive. I know the BPMN standard has a lot of reach and applicability to a lot of organizations, but by default it doesn’t have a ton of variability in the symbology you can use beyond what the standard defines. And there are some tools, for instance Software AG’s ARIS, that have extended the BPMN standard with some more symbols you might find useful as part of the palette. Look for things like that if you’re talking annotations, risks, KPIs, different types of auxiliary symbols to define things (we didn’t really go into this as much in our episode on selecting an architecture tool). In this particular case, this information can only be retained if there’s somewhere for it to go, so finding a tool whose notation allows for expansion beyond what you see on the page in a BPMN model? Super, super useful, particularly in the import context.

Roland: I think the more interesting question is where you make the cut, and what you do with the content that you’ve cut. What do you need to save? Does it go back, to use your example, into the storage room – or, as I like to call it, the junk kitchen drawer that you never open? Or do you just throw it away? So the interesting question is what you do once you’ve made that cut, because maybe there are parts of it that you don’t want to lose.

J-M: Right? And you can always tie whatever end-state repository you have to other types of document management systems. We see that a ton for organizations that have a lot of text content, things that wouldn’t really fit well in an architecture tool. They’ll keep them in a shared document repository as documents, and then attach them to the models and objects they’re bringing in as part of the import process. And that really speaks to the next point I wanted to hit, our last point for this section: content formats, and what they imply or allow. For instance, extremely large sets of unstructured data – pictures, things that can’t be imported like a structured file such as an Excel or .bpmn file, formats that really translate into models and objects. That unstructured data isn’t going to come in as well as you think it is. If you’ve got a picture on a wall, importing that into a tool is hard – there are some out there that actually do have a picture-to-model feature, but they’re few and far between. I know that can sometimes be difficult to reconcile after the fact; you have to do rework.

Roland: Yeah, or what is the content that might go into a lower level, like an attribute, for example. Just think about descriptions. Think about processing times for steps. Think about frequencies – how often does this happen? Or, in the application case, number of users, technology, and all those things that you might want to capture at the lower level.

J-M: I agree, but we’re looking to figure out: what does this format allow us to do? Models or structured content that implies flow will often lead to the creation of process models if you do it right – scripting that allows you to automatically generate a flow, or formats like Visio, .bpmn, or .archimate that have interchange languages which will regenerate that model for you. And then, how do you increase the fidelity with spreadsheets of data, or databases you can query against? That’s going to allow you to add those attribution categories that give you the context you’re going to need to make decisions on the ultimate content you’re bringing in.

Roland: So at the end of the day, I think all of those points that we just spoke about should lead you to a big repository. You should have a big table of content (and I do have a preference for which tool you should use), with things like: what’s the name of the model, what’s the age of the model, who’s the responsible person, what’s the format that we’re looking at, and so on and so forth. And then in further columns you might use the same table to sort things, putting in criteria like: how important is it for us? What’s the effort that we see – the low, medium, high complexity – and so on and so forth? So at the end of this first step, you should have a catalog of your existing content.

J-M: And that’s going to be our question for you as part of our first segment. Think about your knowledge storage room or kitchen drawer as Roland likes to call it. What do you have in there? What do you know about that you need to keep? What do you absolutely need to harvest from it to continue your operations? Look deep inside, ponder that question and we’ll return in just a moment with our thoughts and the next section.

Musical Interlude: “This” by Jeremy Voltz

J-M: And welcome back, folks. Hopefully you’ve taken a look deep inside that kitchen drawer of old content to find some real gems you’re going to bring into your new environment through the import of existing content. Now it’s going to take a few different steps to make this come to life and to make it valuable once it gets there. And I know, Roland, you’ve developed a foolproof method of bringing this stuff in – the five steps you’re going to need to think about when bringing that content in. Can you give us an overview?

Roland: So I don’t know if it’s foolproof, but it’s stuff that I’ve done a couple of times, and it works. The five phases that we look at are: preparation, which is getting ready for the actual import. The second phase is the initial import – when you talk to tool vendors, that’s what they demonstrate to you, but that’s not the end of the work. The third step is the creation of new models, and we will talk about what that might take based on notations when we get to that point. The fourth step, which is the huge effort, is the normalization of models: how do we integrate existing content into the structure? And then the fifth step is quality assurance: how do we make sure that what we just imported not only fits into our structure, but also follows the standards that we follow?

J-M: OK, so if I’m hearing that correctly, I’m hearing preparation. I’m hearing importing your models. I’m hearing creating the new models you’re going to be using. I’m hearing looking at and normalizing those models, and then I’m hearing quality assurance to control and manage the content that you’ve brought in. Is that right? So preparation, import, model creation, normalization, and ultimately quality assurance. That seems like a good five-step process. Why don’t you take us through those steps one at a time? In this section I think we’re going to talk about the first part of it, which is our preparation and our initial import, ’cause that’s going to be the configuration that ensures the success of your subsequent steps. So first, talk to me: what do you do to prepare?

Roland: Yeah, so this is a one-time step, ideally. Remember, we already have our catalog, so that’s the first big step. The next thing that I would look at is the database setup in your architecture tool. I will put some graphics in the show notes so you don’t scribble while driving, but typically what you see is a three-database setup for regular content development. You have a work database, and this is where the sausage is made. Then, once a model has a certain ripeness, you move it into a review database, where your business people and your risk people and your QA people have a look at it – it might go in some circles there – but once everybody has approved it, it goes into some production or master database, which holds the highly polished stuff. Now, you don’t know what quality your existing content has. So the first recommendation that I would make is to do the imports in a separate import database, because you will import a lot of noise – content that somebody once thought was useful, but might not be useful today.

J-M: I think this is particularly relevant when you have an existing repository set up and you’re looking to bring in new content and new stakeholders. You never want to bring that new content straight into the production environment someone is relying on, because you don’t know what you’re going to replace. You don’t know whether or not it’s in the right context, and ultimately you could be artificially creating new noise in an area that you’ve already worked to clean up.

Roland: And when you look deep into the farthest corner of your heart, we both know nobody will ever clean up that noise in the work database. It will just be there forever, but it will spoil your analysis, because there might be relationships from objects to objects, and then you run some reporting on it and you just see weird results that you can’t explain.

J-M: Or like library objects that are no longer relevant. Oof, those kill me, ’cause we were trying to establish an enterprise standard – and now you’re filling it with crap that is not the standard. Don’t do that. Keep it in an environment that you can import from, and then later on we can bring it into somewhere we can work together.

Roland: And the other reason for an import database is that you might have elevated permissions, or a different method filter, in your import database than what you would have in your work, review, or master database. For example, to stick with the BPMN example we just had: when you bring in a BPMN file from whatever tool into our tool of choice, you might see that they all get created as BPMN process diagrams, according to the spec. Now – you mentioned it, J-M – you might have defined for your standard that you want to enhance your BPMN diagrams with additional symbols, like org swim lanes or whatever you added to it. That’s a different model type. So you need to allow that base BPMN model type in your import database, so that users can actually see what they imported, but you definitely do not want to allow those model types in your work database.

J-M: Absolutely not.

Roland: So that’s the second point. One is avoiding the noise that you bring in, by having something like a, quote unquote, mesh filter that filters it out. And the second one is that you might need elevated permissions, or a different method filter, so that you can actually do the import and the sorting out of content. The next thing you want as part of the preparation is to have your tracker ready. And I mentioned that J-M and I have a preferred tool for this: we like to use Notion for this purpose, because it allows you to create databases where you can create different views, and it allows you to use templates and whatever.

J-M: You’re pulling back the curtain on the secrets behind our own podcast.

Roland: Well, the point is, we also developed a tracker in that tool. Shameless pitch for this, but what you want in that tool is one place to go for your catalog, and then for each model in that catalog you want to apply the steps that you’ve defined (which we’re going to talk about in a minute) and be able to track them. Have I done the initial import, yes or no? Have I done the merge from the import database to the work database? Have I done this, have I done that? That is super hard to track in a less visible or visual tool, and therefore we love Notion for that purpose.

J-M: Yeah, and remember, you’re populating the first category – what am I looking to do? That’s a negotiation. You’re populating that based on what’s in the drawer and what you think is going to be good, and then tracking it through its statuses allows you to make sure you’re retaining everything you’ve decided to retain, ’cause if it’s made it to the list, you want to at least give it a shot, right? Being able to manage that project plan also gives you more control over the timelines that you’re looking at as part of data ingestion. So all these things are really important – it’s part of the project management scope, right?

Roland: And you can be damn sure that your stakeholders will ask you: how far are we? You want to know who’s behind. Is there a certain person who’s behind? Is there a certain area that’s behind because you don’t get the SMEs, and all those things? And obviously their expectation is that once you start, they give you a day or two and then they think you’ll have already imported 80% of all the content – which is not true. That will not happen.
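(A minimal Python sketch of what a per-model tracker record could hold – the field names and phases mirror the discussion, but the structure is an assumption; in practice this would live in Notion or a similar tool.)

    from dataclasses import dataclass, field

    PHASES = ["initial_import", "new_model_created", "merged_to_work",
              "normalized", "qa_passed"]

    @dataclass
    class TrackerEntry:
        model: str
        source_format: str              # e.g. "visio", "bpmn", "excel"
        owner: str
        importer: str = ""              # who imported it (avoids duplicate work)
        status: dict = field(default_factory=lambda: {p: False for p in PHASES})

    def progress(entries: list) -> str:
        done = sum(all(e.status.values()) for e in entries)
        return f"{done}/{len(entries)} models fully imported"

    print(progress([TrackerEntry("Verify income", "bpmn", "Jane Doe")]))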

J-M: Well, there’s Roland making sure everyone gets their job done right on time.

Roland: Well, to give you an example: once upon a time I made a proposal for a large bank here in the United States to help them import existing content, and they were talking about somewhere between six and a half and eight and a half thousand models that they had in different formats, in different repositories, and whatnot. And their timeline (unfortunately we didn’t win) was: yeah, we’ll get this done in a year – even though it affected every organizational unit of the bank. In all reality it took them more like 15 to 18 months to do this. So then the question comes up: how relevant is that content after those 15 to 18 months? But let’s leave it at that. And they actually built a machine that cranked out dozens or hundreds of models a day through all the processes of approvals and reviews and all these things. So it’s not a trivial feat.

J-M: Yeah, that makes a lot of sense, and that leads me to the next point I wanted to cover in this section: besides the preparation, let’s talk about our very first steps. ’cause the first thing we’re going to be doing is essentially dumping one system into another. We’re doing a raw import, as we call it, which is going to bring content in as close to the format we started with as possible. And we’re going to take that content and build it into knowledge repositories that allow us to see it as either objects in a library or as models with their structure and flow. That’s going to be bringing in those source files, creating that stuff, getting us started. Roland, I know you’ve done this at a very large scale. Tell me about this raw import process. Where does it go wrong? Where is it really good for getting information in?

Roland: That’s one step that you have to do first as part of your preparation, and it depends on the source file format. You need to look at your architecture tool and maybe configure it in a certain way. For example, if your source format is a Visio file – and we all know that in Visio you can draw whatever you want; even though they have stencils, there’s no method behind it – you need to tell your architecture tool: what does that rounded rectangle mean? What does that circle mean?

J-M: Ah, the mapping.

Roland: Yes. And what is the stuff that you should just not import? So that’s the first step from a technical perspective when you bring that in through the actual import. You have to prepare your database, as I just mentioned before, you have to do the model and object mapping, and then you have to define which parts of that process should be automated – to say, “shall the merge from the import database to the work database be automated, or shall that be done manually?” Both ways are possible.
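(An illustrative Python sketch of such a shape-to-object mapping – the shape names, target object types, and ignore list are assumptions for illustration, not any specific tool’s configuration.)

    # Map source drawing shapes to repository object types before import.
    SHAPE_MAP = {
        "Rounded rectangle": "Function",        # a process step / task
        "Circle": "Event",
        "Diamond": "Gateway",
        "Cylinder": "Application system",
    }
    IGNORE = {"Title block", "Legend", "Page frame"}  # drawing noise, not content

    def map_shape(shape_name: str):
        if shape_name in IGNORE:
            return None                         # do not import at all
        return SHAPE_MAP.get(shape_name, "UNMAPPED")  # flag for manual review

    for shape in ("Rounded rectangle", "Legend", "Star"):
        print(shape, "->", map_shape(shape))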

J-M: Yeah, and eventually you’re going to end up hand touching a lot of these things anyways. I always use the 80/20 rule: the best you can hope for is that 80% will come over in an automated fashion, but really, we can’t expect that everything is going to perfectly translate, right?

Roland: Yeah, and it’s different if you have, say, 50 models to import versus the example that I just gave, where they had literally thousands and thousands of models. So it’s also thinking about: does it make sense to create an automated workflow for this, or is it done faster just by doing it manually?

J-M: Yeah, I see people talking about the scripting time – the time required to build something based on your configuration might be longer than just having your people go and do the work.

Roland: Yes, and that might also be a hard question to ask when you look at your content. Which brings us to the second step: the initial import. You have to have a look at your content and say: what do we actually bring in? I typically like to call it the raw import. When I think about the structure of my database, one thing that I like to do is create a folder for each model that I want to bring in, independent of what type of model it is. And underneath that I create two folders: one is called “raw” and the other one is called “ready to import” – we will talk about that in a minute. What I do next is gather the necessary information for the process. What does that mean? I definitely want to have a graphic of the process, because when you think about a process, you obviously want to know the sequence. The second thing is you want to capture the information about who’s doing it – who’s my contact person? And get additional material: they might have, for example, a document that accompanies the process, or risk and compliance information. So I’m thinking about PDFs, Word documents, spreadsheets, that type of stuff. The last thing is, if it’s a technical format, I’d love to have the source file – how do I import a .bpmn or .archimate or Visio file if I don’t have the file? So four things: the source file, a graphic of the model, additional information that might exist, and the contact information and availability of the owner and the modeler, in case questions come up.
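(A small Python sketch of that per-model folder structure – shown here as a filesystem analogy; in an architecture tool these would be groups/folders inside the import database.)

    import pathlib

    def prepare_folders(root: str, models: list):
        # One folder per model, with "raw" and "ready to import" beneath it.
        for model in models:
            for sub in ("raw", "ready to import"):
                pathlib.Path(root, model, sub).mkdir(parents=True, exist_ok=True)

    prepare_folders("import_db", ["Verify income", "Open account"])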

J-M: And as you ask for that additional information, that can also lead you down the road of: what am I missing? Because the contact owners are going to tell you: hey, here is some other auxiliary information you might also want to import. And you can iteratively say: OK, I missed this in my first cut of what I was going to bring in, but this actually fits into the import as well. So now you’re going to involve those stakeholders and get their buy-in, so they’ll be more ready and willing to invite themselves into the new environment that you’re creating with their content.

Roland: Yes, and the last step that you shouldn’t forget, if you have a team that does the import: once someone imports a model, they put their name as the importer in the tracker, so that you avoid duplication. Not that I’ve seen that before… Alright, so the next step is the actual import, and this is all dependent on the source format. Typically, architecture tools have different importers – Visio, Excel, .bpmn, whatever it is – and they work differently. One thing (to stick with the BPMN example) that you see, for example, when you get a .bpmn file, is a, quote unquote, little bit of noise when you unzip it – you can open the BPMN file with your zip tool. You’ll see, for example, a glossary.bpmn or a resources.bpmn. These are all files that were created by the source tool or a runtime system, wherever you get it from. Typically you don’t need them. What you also see in that zip file, though, is a folder (or many folders) that has the name of the process. That is the one that you want to have a look at, because it might include additional .bpmn files itself – think about a process that has multiple subprocesses. So one thing you need to do is identify which of those .bpmn files is the master, because that will later become a BPMN collaboration diagram, while the other ones become BPMN process diagrams – a different model type.
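(A quick Python sketch of that first look inside a zipped .bpmn export – the noise file names follow the episode’s example and are tool-specific assumptions.)

    import zipfile

    NOISE = {"glossary.bpmn", "resources.bpmn"}   # generated by the source tool

    with zipfile.ZipFile("export.bpmn") as zf:    # some tools ship .bpmn as a zip
        for name in zf.namelist():
            base = name.rsplit("/", 1)[-1]
            if base in NOISE or not name.endswith(".bpmn"):
                continue
            print("candidate process file:", name)  # master vs. subprocess: check by hand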

J-M: And would you recommend that they export information in as granular a form as possible? As in, would you say: hey, don’t export things in a hierarchical fashion, export a bunch of lowest-level files so I can import them all as a population of my lowest-level files? Or would you recommend having this hierarchical decomposition in your exports?

Roland: I think beggars can’t be choosers, so I would take whatever I can get – but you bring up a good point. I would put a column in the tracker that says “this belongs here”, so that you’ve noted the area you will have a look at later, when we get to the normalization. Once you’ve sorted that out, you would actually use the importer – again: Excel, Visio, BPMN, ArchiMate, DMN, whatever it is – and do the initial import into that raw folder. And with that, you might have some surprises. For example, to stick with the BPMN example: even though BPMN is a technical format and not just a notation for visualization, it’s implemented differently by different vendors, and depending on your architecture tool – I know the one that I use checks the BPMN file before it even imports anything – you might run into error messages or warnings. If it’s a warning, you get a pop-up that says: hey, something will be lost, will not be imported. If it’s an error, the tool just does not import it, and that is when the panic kicks in: oh no, I can’t do this. I typically told my people just to breathe and have a look at the error message, because typically you see: oh, there’s an error on line 262. So you go and open your .bpmn file in your notepad, scroll down to line 262, and see what it is – and most likely what you see is a tag that is source-tool-specific and not part of the BPMN specification.

J-M: Yeah, and any good tool that has the ability to import formats like .bpmn should have an error checker built into it. It should be able to tell you in plain language how to fix things; otherwise you’re never going to be able to bring things in properly – you’re going to have to go back and keep guessing at what’s going wrong.

Roland: Agreed, and in the majority of those cases, when I saw the errors, I simply deleted that whole vendor-specific tag from the source system and tried it again – and it worked. So when you think about the overall population – this is by no means a hard rule, so don’t quote me on this – what I’ve seen in the past is that about half of the process models that follow the .bpmn format import just fine. Of the other half, maybe 80% needed a little bit of schmoozing, like cutting out a tag, you know, five lines, and then it works. And then you have that remaining part of models that are just stubborn because there are dependencies: oh, you took out that tag, and then something else popped up, and something else popped up. At that point you might want to think: hey, is it worth it, or do we just say, OK, this model now goes into “manual mode” – which means somebody literally fires up the architecture tool and recreates it by hand.
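(A rough Python sketch of that clean-up step: dropping elements that are not in the official BPMN 2.0 namespaces before retrying the import. It’s a blunt instrument – back up the file first, and expect the stubborn cases Roland mentions to still need manual work.)

    import xml.etree.ElementTree as ET

    ALLOWED = (
        "http://www.omg.org/spec/BPMN/20100524/",  # BPMN model and diagram interchange
        "http://www.omg.org/spec/DD/20100524/",    # diagram geometry (DC/DI)
    )

    def strip_foreign(parent):
        for child in list(parent):
            ns = child.tag.split("}")[0].lstrip("{")
            if not ns.startswith(ALLOWED):
                parent.remove(child)               # vendor-specific tag: drop it
            else:
                strip_foreign(child)

    tree = ET.parse("process.bpmn")
    strip_foreign(tree.getroot())
    tree.write("process_clean.bpmn", xml_declaration=True, encoding="UTF-8")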

J-M: I’ve said before, with the 80/20 rule you can expect that 20% of your content is going to have to be redone in some capacity anyway, so don’t be scared of that. The other thing I want to talk about: I’ve seen things like lanes come in in a different order when you import, and I know that can happen, and that’s OK. You’re looking to represent this in a new way in a new tool; sometimes the rendering isn’t exactly the same as it used to be, but the content is all the same. Reviewing that import and checking the visual formatting is really important, because as you bring it into your higher-fidelity tool, as you look to move forward with this content, you want to make sure the layout – the look of it – matches your new standard, right?

Roland: Yeah, so good architecture tools do have something like a layout wizard that helps with that job – where you can apply the wizard and say: push the objects by so many pixels to the side and upwards and whatnot, so that it looks a little bit better. But this is where your graphic comes in, because you want to push the pixels in the model so that it resembles that graphic as well as it can. So the tool should have that layout wizard, it should be able to apply default object appearances – to say, yep, that circle is a certain size and no one circle is bigger than another – and then you also might want to apply your template to the model, so that if you have specifics defined for attributes that should show, or a certain visual that you want to see, those will also be put in. So now, at the end of this step, you have imported the main model from your BPMN file and you have imported the subprocesses. They all might show up as the same model type, as a BPMN process, and this is OK at this point in time, because now you’ve done what the tool is supposed to do: you’ve done your initial import and brought everything into the raw folder of your model in the import database.

J-M: Well, this is a great point to take a quick break, and during the break we’d love for you to think a little bit more about where you’re going to be putting this content. What platform, what format, what structure are you looking to bring your existing content into, and how do you think that’s going to provide value for you and for your organization? We’ll leave you for a couple of seconds and come back with our thoughts and the next three steps in our import process.

Musical Interlude: “FC Groove”, Jeremy Voltz

J-M: Welcome back, folks. It’s been a great opportunity to think about your future states, but now we’re going to help you get there. And the next three pieces of our puzzle are creating new models, normalizing those models, and doing the quality assurance that makes sure you’ve got what you need and that it’s looking right, feeling right, and ready to create value for your organization. Roland, tell me a little bit more about the first one: what does it mean to create new models from the content you’ve already brought in?

Roland: Yeah, so like I said a couple of minutes ago, the tools typically bring all the models in in a more generic format that might or might not meet your configured requirements – in most cases that I’ve seen on projects, it does not. What I like to do (and we’ll stick with the example of .bpmn models for this episode) is have a look at my folder structure: under the name of my model I see a raw folder and a ready-to-import folder. In that second folder I create a new BPMN collaboration diagram, because that’s what I want to have my main content in. Then I open both models – the one that was just imported and my brand spanking new collaboration diagram – and I simply copy the content over into that collaboration diagram. This is where the different method filter configurations kick in, because in your pure BPMN you might have a certain set of objects that might not exist in your specific collaboration diagram. In ARIS, those are called Enterprise BPMN Collaboration Diagrams – a different model type – and in your method filter you would allow, for the Enterprise model types, just the subset that you would also use in the work database.

J-M: Yeah, can you do this in an automated fashion? Is there a way to script the copying over or the changing of model types? I know that that might speed up things if people are thinking they’ve got large subsets of content.

Roland: In a good architecture tool, of course you can script stuff. It’s just a question: is it worth it, or is going into one model, pressing Ctrl-A, Ctrl-C, then going into the other model and pressing Ctrl-V the faster way? What you do here is two things. You create your Enterprise BPMN collaboration diagram, and for the other models you create Enterprise BPMN process models. Then you obviously have to connect those two – or, as we call it, create an assignment. Say in your collaboration you have tasks and subprocesses: for each subprocess you should have a BPMN process model, so you need to create that assignment. The other thing you might want to do: you might have auxiliary information, like which applications are used or which risks might occur on a certain task, that might be part of those Word documents or PDFs that you collected in the first step. Well, you might want to create relationships to those – and that could be either in a separate model type, or a relationship that you create in the properties, depending on your tool. At that point in the game you have a model that is in the correct model types, a model that looks and feels as you would expect it in the work database, and a model that includes all the auxiliary information – risks and controls and apps and whatnot – that you found in the source documents. So you’ve not only done your initial import; you’ve created those new models, and now they’re ready for the big move into your work database.

J-M: And I want to talk about something briefly, because I actually had a conversation with a client literally today about this. If you think about putting in a lot of additional information, I would hesitate to use the attribute level, particularly for information that’s connected to your processes and process steps. I had a client today who wanted to put a ton of additional attribution, including things like personas, onto the objects themselves as selectable attribute values. The struggle I was trying to convey is that these attribute values are actually used across lots of different models and lots of different steps, and they really should be an object, not an attribute. Think about it like this: if later on you’re going to report on the reach and connections of a certain property – in this example a persona and the customer journey set – are you going to look at the persona and ask “what journeys affect this persona?” It is going to be infinitely easier for you if that is an object related to the models, or related to other objects, because you’re able to query around it much more quickly. You can make sure that if it changes, the change propagates quickly, and if it suddenly relates to new things, you can simply make those connections by drawing lines rather than going and filling in fields. So when you bring that stuff in, remember: you’re trying to populate a library full of reusable content in the form of objects, so you can make those relationships. That’s going to drive better adoption, because more people are going to see their stuff connected in, and it’s going to help with reporting and analysis and visualization down the line. Does that make sense, Roland?

Roland: It does, and that is one part that you should have agreed upon in the beginning: what gets mapped to what. In your scenario it also might lead to a situation where you have conflicting information, because, for example, you put the role into an attribute of a task. That’s all great, and it might work for one model. But once you’ve brought it in, you have a catalog of roles, and there might be a team that harmonizes that catalog. So now they change things, and the wording changes. So when you run your report, what is the true information? Is it what was put in an attribute of that task – who does that task? Or is it the relationship to an object that might have been consolidated itself? My preference, for obvious reasons, is the latter.

J-M: Aren’t we supposed to be getting better at this, because we’ve changed to a higher-quality, higher-fidelity tool? Like, let’s not fall back into old patterns, right?

Roland: Yep, I’m with you.

J-M: And I want to take this to the next point, because I think this is a really important move forward. We’ve moved all the things into our work database, we’ve merged them in. Tell me about the normalization of models, because we want to bring them into the structure we’re looking to use for the organization.

Roland: Yeah, just a little reminder of what I said before: when you do the merge, what I like to do in the tool that I’m using is move the ready-to-import models, with all their objects, into a first-level import folder. Because when you merge that level-one import folder into your work database, it does not create subfolders upon subfolders of information; everything is contained in that one folder, so you don’t introduce new noise into your work database – because, as I said before, nobody will clean up that mess.

J-M: It’s going to be you, oh.

Roland: So once you’ve brought it into your work database, you’re basically done with your import – as vendors might tell you – but this is where the fun begins. The first thing you want to figure out as part of the normalization is: where do my models go? So that model – say, “verify income” in a loan scenario – where does it go? Which folder does it belong in? Then you look at the subprocesses and need to check: hey, do I have that already in my database, so that what you just imported is redundant? Or is that actually not a subprocess, but an interface to another high-level process that you have? So your, quote unquote, modeling – your source file – might have been incorrect from a content perspective. The idea is to figure out how that new thing, plus its sub-things, fits into the overall structure. The second thing you want to do – to your example – is, if you create role objects and app objects and risk objects and all these things, you want to maintain catalogs. Think about (and we spoke about this in the implementation episode) building interfaces to other systems. Say you have a CMDB that gives you an application catalog. Now you’ve created, as part of your import, an application object in your repository – and, to stick with an easy example, you might have agreed as part of your CMDB that all office files are just called “Office 365”, and you don’t care if it’s Excel, Word, PowerPoint, or whatnot. But now, in your imported model, somebody says: oh yeah, you use Word for this, or Outlook, right? So that is a conflict that has to be resolved.

J-M: And when you consolidate objects together, you’re going to create that relationship between this new content, so the objects that are coming in on those models, and your existing library of content. And that’s a really important part of things you don’t want to just bring flat files into it and populate a whole bunch of new definitions. You instead want to stitch them and weave them together with your existing enterprise repository, right?

Roland: And the challenge in doing this is that every single “redundancy”, every single instance of consolidation, is a conversation between people. The tool will give you, for example, all the objects of a certain type that have the same name. What you will typically see – to stick with that BPMN example – is a gazillion events that are labeled “start” or “end”, for that matter. So you want to have a look at each instance and ask: is that really start and end, or does it have to be a different name because it’s part of an upstream-to-downstream process handover? The tools support this, typically by giving you the analysis – how many objects do I have with an identical name of a certain type – but at the end of the day you need to talk to people. And this is why it’s important in the first step to have identified not only the modelers, but also the owners of those processes or models, because they need to make a decision.
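(A small Python sketch of that consolidation-candidate analysis – grouping objects of the same type by identical name. The object list shape is an assumption; tools typically ship this as a built-in report.)

    from collections import defaultdict

    objects = [
        {"id": 1, "type": "Event", "name": "Start"},
        {"id": 2, "type": "Event", "name": "Start"},
        {"id": 3, "type": "Task",  "name": "Create Invoice"},
    ]

    groups = defaultdict(list)
    for obj in objects:
        groups[(obj["type"], obj["name"])].append(obj["id"])

    for (otype, name), ids in groups.items():
        if len(ids) > 1:                      # candidates to discuss with the owner
            print(f"{len(ids)}x {otype} '{name}': ids {ids}")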

J-M: Yeah, they’re the ones who are going to give you the information that you need, and they can sign off on the decisions that you make, right?

Roland: Agreed, agreed.

J-M: And that’s really important, particularly when you’re talking about normalizing these models to the conventions you’re going to end up using, right? You’re taking models that someone has poured love and time and money into, and you’re changing some components of their format to match your new enterprise standard. So you want to make sure you involve those stakeholders in the conversations and get their sign-off and buy-in as this new format comes to life, right?

Roland: Yes. So when you talk to the modelers – the people who created the source files – you have one set of conversations that circles around: did the tool import that model correctly? Does it look like the graphic that I, as the creator of the source file, created? You might run into situations where, for example, one tool allows you to do things that are not compliant with the spec – for example, subprocesses that stretch across multiple swim lanes, which doesn’t make sense according to the spec. So that’s the first conversation. The second conversation, as part of the normalization, might be with the same person or with the owner, because you want to adjust the model according to your conventions. Your conventions might prescribe, for example, that the happy path must be clearly identifiable – so you push pixels around, changing the visual layout to accommodate this, which might mean it doesn’t look at all like the imported model anymore. You also might look in this step at what you’ve agreed upon for naming conventions. For example, one thing that I like to do is to say: OK, in a lower-level model, tasks must be named in a verb-noun combination – “Create Invoice”, “Pay This”, whatever it is. You also might have a convention in place that does the same for events, which you typically flip around: it’s a noun-verb combination there.

J-M: But in the past tense, like “Invoice Created”.

Roland: Yeah, “Invoice Created” – something has happened. 

J-M: But this might be different from the way it is in your source tool, because you wouldn’t necessarily have those same rules enforced there – so there might be a bit of a translation effort to make sure it’s the right language for your final destination.

Roland: Right, same thing in BPMN: look at condition expressions and put them on the connections. And when you create your swim lanes, reuse pools, roles, and application lanes from your catalog items.

J-M: Yeah, this means we’re starting to talk about the very first level of QA, the modeler QA, where we’re talking about compliance with what I would consider the semantics of a model. Do things work the way they’re supposed to? Does the flow work the way it’s supposed to? If so, we’re ready to go, and if not, we need to restructure the model – not just push pixels, but also move connections, move lanes, build it in the right way to match what we actually want, so that it looks the way we want.

Roland: Yes, and this brings us to the last phase. Just to put everything back into perspective: we had the raw import, the initial layout of the model, and all those things done in the import database. Everything we spoke about after that happened in the work database, including the modeler QA – for which I highly recommend creating a checklist. I also highly recommend creating what J-M mentioned, semantic checks – little scripts that verify the model was modeled correctly – and little macros that are triggered when somebody saves a model, which help you maintain meta information: who’s the contact person, what’s the description of the model, what’s the approval status, and whatever else you defined as part of your release process. Once the process modeler has done that QA, hopefully supported by technology, we get to the moment where the review starts. And, as we’ve mentioned a couple of times, this is that typical scenario where you want an automated process that guides you through and routes the approvals, because you don’t want to be in the business of shepherding calendars and the availability of your reviewers. As you continue through this, you might have to decide whether there should be a more formal QA at the end of the process, or in parallel to the approvals. My suggestion is to do it at the end: have your business and your risk people approve the content first, because typically they don’t care about formalities the way you might, and then, as the last step, have a dedicated QA team do what I like to call a “technical QA” – a more in-depth review of the model so that it complies with the standards.
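(A toy Python sketch in the spirit of those semantic checks – verb-noun tasks, noun plus past-tense verb for events. The word lists are assumptions; in a real tool this would be a macro or report script in the tool’s own scripting engine.)

    VERBS = {"create", "pay", "verify", "send"}       # assumed allow-list

    def task_ok(name: str) -> bool:
        return name.split()[0].lower() in VERBS       # "Create Invoice" -> OK

    def event_ok(name: str) -> bool:
        return name.split()[-1].lower().endswith("ed")  # "Invoice Created" -> OK

    findings = []
    for task in ("Create Invoice", "Invoice handling"):
        if not task_ok(task):
            findings.append(f"Task not verb-noun: '{task}'")
    for event in ("Invoice Created", "Start"):
        if not event_ok(event):
            findings.append(f"Event not noun + past-tense verb: '{event}'")
    print("\n".join(findings))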

J-M: Yeah, I wanted to talk briefly about the business QA side, ’cause I think this is a really important point that gets missed a lot, even in earlier phases of this import process: the business relevance of this content. I mean, you can create (and I’m sure you’ve seen this before, Roland) a whole catalog of crap that you’re going to bring in, and it looks good. It may even be in a nice format. Maybe it’s been properly maintained. Maybe it’s even somewhat current. But it’s not necessarily right for the business, and that’s something that can very easily get lost in the technical import of stuff. A perfect example: stuff we had built as part of a deliverable for a To-Be state that never ended up being used. That happens all the time; that stuff’s all over the place. There are a billion “what if” scenarios captured, and if a To-Be isn’t clearly marked as a To-Be and someone assumes it’s an As-Is – well, now you could be bringing in stuff that’s not actually real and misinforming a larger community, and that’s going to lower the confidence and trust in your environment quite substantially. So once again, you want to bring that in earlier, but I know it doesn’t always happen. Making sure you have business owners who know what the business is actually doing, who can talk to the projects that have been going on, what they’ve achieved and whether they actually got implemented, and who can ultimately make decisions on the content to make sure it’s right to bring in: those people need to be at the table, particularly in this last phase of quality assurance.

Roland: Wholeheartedly agree. Remember, all of this happens in the review database. The next step is to formally publish your models. Ideally that’s supported by an automated governance process or workflow, but at a minimum you want to do a couple of things. One, you want to version your models, so that you can compare what has changed. Two, you want to create release notes – if it’s a brand new model, that’s easy, you just write a description, but if it’s an update to an existing one, you might want to say what has changed. Three, you want to push the model into the published database, your master database, where everybody and their grandmother with a viewer license can see it. But you also want to push it back into your work database, so that at a later point in time you have the same status of the model in your work database as in the published database. And then you delete it from your review database, because that’s just a temporary container.
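(A conceptual Python sketch of those release steps – databases modeled as plain dicts keyed by model name; not any specific tool’s API.)

    def publish(name, work_db, review_db, published_db, notes):
        model = review_db.pop(name)                      # review DB is only temporary
        model["version"] = model.get("version", 0) + 1   # 1. version for comparisons
        model["release_notes"] = notes                   # 2. what has changed
        published_db[name] = dict(model)                 # 3. visible to viewer licenses
        work_db[name] = dict(model)                      # 4. keep work DB in sync

    work, review, published = {}, {"Verify income": {"owner": "Jane Doe"}}, {}
    publish("Verify income", work, review, published, "Initial import from Visio")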

J-M: And I would add another component to this: notifying people. You want to make sure that your larger community of collaborators knows that this has been imported, so they can come see it and comment on it. Once again, that’s going to bring in those business stakeholders a lot more easily – the wisdom of the crowd, even your end executors, who can come back and say: hold on one second, this is not how we actually do it. Wait a second, you brought this in – can I give you some feedback? And that’s fantastic, because that leads into the ongoing operations and maintenance of a knowledge repository – isn’t that what we’re trying to do?

Roland: Yes.

J-M: Unless there’s anything else, Roland, I do want to give people another little break to think about what they’ve heard so far. You’ve heard about the five steps required to bring information in: starting with the preparation, the initial import, the creation of new models, the normalization of models, and the important quality assurance steps at the end of the road. Be practical with yourself: what do you think you can actually import? What is actually worth importing, given the effort required to ensure this content is correct and relevant for you? And what do you hope to get from that imported content once it’s in your system of choice? We’re going to leave you a few seconds to have a thought, and we’ll be back with our conclusions and the end of the episode.

Musical Interlude: “Cowbell”, Jeremy Voltz

Roland: Welcome back, and I hope you had a good think about what you want to achieve with your newly imported existing content. Which brings me to the main question, J-M: what does our ideal end state look like? What have you seen with clients in the past that would describe that import Nirvana?

J-M: Yeah, the ideal end state for me is three things. Number one is a structured repository that I can access. Number two is that the content is correct and in the correct format, so people can easily read it. And number three is that the content is right for my business. Those three things together mean that I can, A, retain, B, access, and C, use what I’ve brought in. And I’ve seen a lot of clients really struggle, particularly on the third one, because they didn’t think about it early enough. They didn’t prepare by taking stock of whether it was really useful for the company, so they ended up getting bogged down in Visio files – hundreds of models, sometimes even 1,000+ shapes on a single model – and after all that effort, at the end of the day they couldn’t even use what they were seeing, because it wasn’t relevant for the business. So I want the opposite: I want that Nirvana of good stuff I can see and use in my daily life. And Roland, I know you’ve helped a lot of organizations bring content in and get value from it, and we’ve seen tons and tons of war stories of where things have succeeded and where they’ve failed, and I know we can’t address it all in the podcast. So thank you all for listening to this last hour of exploration – but of course, whatsyourbaseline.com has companion articles and lots of information about how you can put this into practice, where we’ve seen it succeed and fail, and what it means to you. So thank you so much, folks. And I think, Roland – well, this was a good episode. What do you say?

Roland: It was a good episode – after I get in my shameless plug for the tracker that we’re going to offer as a product on our website. If you’re interested in getting a proven process mapped out in Notion, our favorite tool for this purpose, have a look at our website; I will link to it in the show notes. But otherwise it was a very, very good episode, J-M, also from my perspective. Thanks for listening. Remember, we’re very, very eager to get your feedback, so the best way to contact us is to send us an email at hello@whatsyourbaseline.com, or leave us a voice message by clicking on the link in the show notes. Also, we would greatly appreciate it if you would leave us a rating and review in your podcatcher of choice, so that we can improve.

J-M: And lastly, you can find the show notes for today’s episode at whatsyourbaseline.com/episode7. And once again, I’ve been J-M Erlendson.

Roland: And I’m Roland Woldt.

J-M: And we’ll see you in the next one.