[A] Podcast

Pathways to Structured Authoring: DITA and Beyond

[A] Podcast #14



Interview With Patrick Bosek

Patrick Bosek, CEO and founder of easyDITA, discusses the future of structured authoring in this engaging podcast.



Patrick is a co-founder of Jorsek LLC, makers of easyDITA at easydita.com. Since 2005, Patrick has been working on a wide range of projects all focused on improving authoring, production, and distribution of content. Patrick advances the product documentation industry holistically and empowers his users.
Structured authoring as a whole is going to be a combination of formats and technologies all working together.


Hello and welcome to Towards a Smarter World. This is Cruce Saunders, your host. I'm joined today by Patrick Bosek. Patrick is a co-founder of Jorsek LLC, makers of easyDITA at easydita.com. Since 2005, Patrick has been working on a wide range of projects all focused on improving authoring, production, and distribution of content. Patrick advances the product documentation industry holistically and empowers his users. We're glad to have Patrick on as someone who develops, productizes, and solves problems with product content software. He’s a developer, thoughtful manager, and passionate customer advocate. Welcome, Patrick!
Thanks, Cruce. Glad to be here.

Let's jump into structured authoring. What do you see as the future of structured authoring?
So, I think structured authoring and probably structured content that's generated as a part of that, and the larger conversation, I really do view that as being the future of content in general, or at least content that's worth maintaining. Any kind of content that's going to be put out over different devices, any content that's more knowledge-based rather than one-off style communication, I think that's all going to move to structured content in some variety or another. As it relates specifically to structured authoring, one of the things that I see happening today that I'm maybe not predicting, but hoping changes a little bit, is that the conversation becomes less of a format wars conversation, which I think is kind of prevalent inside of at least technical content right now, and really focuses on just what are the right tools for different jobs and how do you mix those together.
So, I think that structured authoring as a whole is going to be a combination of formats, a combination of technologies all working together to accomplish various goals, in a patchwork fashion, but one that comes together to form a greater whole. So, in that fashion, I think we're going to see more XML, probably more markdown, more fully structured custom schema, generally as a function of JSON objects, those types of things. I think we're going to see a lot less Word. I think we're going to see a lot fewer formats that are proprietary, or you know, maybe, I hope so anyway.
I love the concept of the format wars. It seems like the name of a sequel in there somewhere, but can you talk a little bit about the format wars?
Yeah. So, you have the group of people on one side who think that DITA, or whatever their variety of XML is, is the absolute right way to do things. And then you have the people on the other side who are saying “that stuff's too complicated, markdown is faster, we should just use markdown, or reStructuredText, or one of the semi-structured formats.” And then you've got another group of individuals who are like “none of this stuff really does what we want it to do. We need custom schemas.” And those custom schemas used to be XML. Nowadays, they're JavaScript and they're JSON primarily.

I think that when you read down through the threads and through the conversations, people seem to think that it's one or the other. It's like, “oh markdown is the right way to do it” or “no markdown is not the right way to do it. DITA’s the right way to do it” or “none of those things do what I want to do.”

But the reality is that there's a place for all of them, and you actually see that in what the DITA technical community is doing. It's like they're actively bringing in markdown, they're making it so DITA can go to markdown and markdown can go to DITA. And that's because I think there's a great wisdom in that community to say, “yes, there are places where you need the full strict structure and componentization that DITA provides, and there are other places where you need things to be faster and looser, and it's really more important that they integrate more easily with developer tools.” So, the things that you can write in what are traditionally viewed as developer IDEs, and things like that.
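
Editor's note: to make the "markdown can go to DITA" idea concrete, here is a minimal sketch in Python. It is not the DITA community's tooling or easyDITA's implementation; the function name `markdown_to_dita_topic`, the sample text, and the topic id are made up for illustration, and it only handles one `# ` title plus plain paragraphs.

```python
import xml.etree.ElementTree as ET

def markdown_to_dita_topic(markdown_text, topic_id):
    """Convert a tiny Markdown subset (one '# ' title, plain paragraphs)
    into a generic DITA topic. Illustrative only."""
    topic = ET.Element("topic", {"id": topic_id})
    title = ET.SubElement(topic, "title")
    body = ET.SubElement(topic, "body")
    # Markdown paragraphs are separated by blank lines
    for block in markdown_text.strip().split("\n\n"):
        text = " ".join(line.strip() for line in block.splitlines())
        if text.startswith("# "):
            title.text = text[2:]          # heading becomes the topic title
        elif text:
            ET.SubElement(body, "p").text = text  # everything else becomes <p>
    return ET.tostring(topic, encoding="unicode")

# Hypothetical sample content
sample = "# Install the widget\n\nUnpack the archive.\n\nRun the installer."
dita_xml = markdown_to_dita_topic(sample, "install_widget")
print(dita_xml)
```

The point of the sketch is only that lightly structured text and fully structured XML can describe the same content, so a pipeline can accept either; a real conversion would need to handle lists, links, code, and metadata.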

But, at the end of the day, your customer or your consumer, whoever they may be, they don't care where this stuff was written. They care that it's accurate. They care that it’s consistent. They care that it all searches the same. That there is not weight being put on one or the other. And, they care that they're receiving the knowledge quickly, which is another way of saying they don't have to read a bunch of stuff they don't care about to get to the stuff that they do care about.

So, I think that when we look at the format wars, everybody is going to kind of wake up and be like, “there's a place for all this, it's all structured. It's a range of lightly, semi-structured up to very structured, and depending on the application, we're going to apply one of these technologies or a combination of these technologies, and we're going to focus on the end-result more, and less on what our dogma is in terms of which structure format we'd like best for our current objectives.”
Wow. Now, we're very aligned in this regard. At [A], we're working on the Master Content Model®, which is essentially a way to try to unify all of the various standards and markup languages and renditions of structured content. We look at the need to create sort of system-agnostic, even standard agnostic content models, owned by the enterprise that represent schema in a way that's independent of all of the representations that content takes along a long supply chain.

So, we look at semi-structured (or, sometimes we call it "pseudo structured") content as a good entry point into a supply chain that then moves forward into XML forms aligned potentially with DITA, and then further downstream gets enriched with other annotation and metadata. That might be, for example, various forms of schema.org markup, which could be JSON-LD or any other markup variant that can represent the same elements within a model.

I love this idea that it's not "this" versus "that" technology. It’s a “let's find the best representation for our given publishing need because our content ecosystems are far more complex than a single standard can usually represent and certainly a single authoring environment or a single publishing model”. We have to create systems that are sort of independent. Does that generally make sense with your vision?

Yeah. I’d really like to add on top of that. So, I do think we're aligned. I think that effectively what you're describing is what we prescribe to people. The thing that I would really add to that is, having that perspective is important, and then choosing your tooling around it in such a way that it really supports it is also really important.

So, one of the big things that we advocate for (well, actually, there are two really big things that we advocate for that I think line up with this) is that regardless of what repositories you pick, they need to be open. That doesn't mean they need to be open source software. It means that you need to be able to get your content in and out of them seamlessly in an open standard, or at least in a well-defined internal standard, so, one that your organization has come up with.

So, you can't have any tool vendor input into your content structure. So, if you choose a tool vendor that uses markdown, it should use the most vanilla form of markdown or your form of markdown. Period. It shouldn't be their version of markdown. If you choose a vendor that implements DITA as a standard, it's got to be DITA. It can't be their DITA, or almost DITA, or based on DITA, it's got to be DITA. This is the benefit of these things. So, if they implement anything through schema.org, it's got to be those things.

So, there's that. And then, building on that, regardless of what hat I'm wearing at any given time, whether it's just giving general advice to people in the industry, or having conversations and advocating for things, or wearing my easyDITA hat (obviously, I'm a vendor), the other thing I'll tell you is to stay away from vendor proprietary schemas. I know that there are vendors out there, like HAT [Help Authoring Tool] tools and stuff like that, they have cool tools and they do cool things, and oftentimes it's very easy to do cool things with them quickly. But, be they a traditional desktop HAT tool or be they a learning system or some type of custom wiki, eventually you're going to outgrow it or you're going to move away from it, and it's going to be a problem.

If you choose to go with a vendor's content format, that has a time limit on it. And it's going to cause problems in the future, guaranteed, every single time. It's just a matter of how long it takes, how long it stands around, and then how big of a problem it creates at the end.

Those are kind of the two big things for me: choose systems that stay open and can be used programmatically via the APIs, and don't choose systems where you're adopting a vendor's content format.
Right on. That's really refreshing to hear from a vendor. And I have to say it's also not common. I find that most of the vendors with the largest footprints within enterprise have a very strong incentive to create content lock-in, and it creates significant friction when we're trying to unify supply chains, when in order to get content from point to point we have to do a lot of manual transformation because of the unavailability of basic API methods. And so it's very important for vendors to consider the long-term picture of such a lock-in strategy. Can you speak to the vendors that are interested in locking content up in a proprietary schema? Why should they consider a change in strategy now?
That’s a fun question. Oh boy, where do I start with that? There's kind of the altruistic answer to that, and then I think there's the business answer to that. The altruistic answer to that is it's the right thing to do. You provide more value and you do it in a way that is better for the knowledge at the organizations you serve. And, the thing is, that's a win for your customers, which is ideally why you're in business, right? I guess you can be in business for a couple of reasons, one of them is purely to just make money, but another one is that you think you're putting value out in the world and people should compensate you for it. If you're in the second camp, you put more value out when you don't lock people into something, and you make it easier for them to build ecosystems rather than trapping them inside of what are effectually silos, I guess. I hate that term, and I could come back and talk for a whole hour on why I hate the term “silo”, but in this particular case I think it's appropriate. So, I think that there's that altruistic aspect of it. It's just the right thing to do for both your customer and for the vendor itself.

Then there's the business aspect of it. So, the business aspect of it is that this is where things are going, and you're either going to be there or you aren't. And I think that that can't be overstated. So, if we look at what's happening in the marketplace right now, we see that structured content is kind of very, very horizontally infiltrating all of these functions that we weren't sure that it was going to infiltrate. So, it's becoming the underpinning for web publishing. It's moving into systems in ways that we don't expect. Here's one of the examples I'll use from our customer base. This is a really unexpected example, and I'm going to get back to how this ties into the business discussion, I promise.

So, one of our customers is the American College of Surgeons and they publish a book called the Cancer Staging Manual. It's the international authority on staging cancer. Everybody uses this to say, “OK you've got X-stage cancer” and then that translates into treatment, which is also in the book. In the past, it was a book, which we all know doesn't integrate with anything very well except for people. When they transitioned over to our product, easyDITA, it's now in an API, and the API can be consumed directly by the EMR [Electronic Medical Records], the EHR [Electronic Health Records] systems vendors. So, when doctors pull up their system to get the content that describes how to stage and treat cancer, that content isn't manually transcoded from a book into a database. It's a direct connection that's got consistency, update times. There's just a whole huge list of value around that. But it's one of these little areas where you'd never expect to be sitting in a doctor's office and have structured content be the pipe that comes in and generates the care plan that they're putting together for you. But it's there. So, it's happening all over the place and it's what's backing all this API delivery, which is the future of content.

This legacy notion of web-delivered content where it's like, okay you get WordPress, and then you go type stuff into WordPress, and then you pick a WordPress theme, and it shows up on your WordPress site. That's going away for the most part. I mean, obviously there'll always be WordPress, there'll always be some need for simple blogs, but it's too slow, it's not flexible enough, it's too hard to reuse your content for different deliverables like, blah blah blah. All the way up the toolchain to the bigger versions of WordPress.
Your big WCM [Web Content Management] software, this, like, type-and-render model, it isn't the right model and people are starting to see that. You can't deploy fast enough. You can't iterate fast enough. You can't reuse content effectively enough. You're locked inside of a vendor silo. You can't utilize things from other parts of the organization effectively enough. You can't have other parts of the organization utilize your content well enough. And so, it slows the whole company down. So, when you're a tools vendor, and you're looking at how you're going to put together your content product, you need to be thinking in terms of “we deliver content as an API, as a connection, as a framework.” Not, “we generate HTML pages that people read.”
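
Editor's note: the contrast between "content as an API" and "HTML pages that people read" can be sketched as follows. This is a hypothetical illustration, not easyDITA's API: the `topic` dictionary, its field names, and both function names are assumptions, and the text is placeholder content.

```python
import json
from html import escape

# Hypothetical structured topic; field names are illustrative only
topic = {
    "id": "example-topic",
    "title": "Example topic",
    "sections": [
        {"heading": "Overview", "text": "Placeholder overview text."},
        {"heading": "Details", "text": "Placeholder detail text."},
    ],
}

def as_api_payload(topic):
    """Serialize the topic as JSON so another system can consume it directly."""
    return json.dumps(topic, sort_keys=True)

def as_html_page(topic):
    """Render the same topic as an HTML fragment for human readers."""
    parts = [f"<h1>{escape(topic['title'])}</h1>"]
    for section in topic["sections"]:
        parts.append(f"<h2>{escape(section['heading'])}</h2>")
        parts.append(f"<p>{escape(section['text'])}</p>")
    return "\n".join(parts)

payload = as_api_payload(topic)  # what a downstream system would receive
page = as_html_page(topic)       # one of many possible renderings
```

Because the structure lives in the content rather than in the page, the HTML rendering is just one consumer among several; a downstream system can read the JSON without scraping or re-keying anything.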
Oh my gosh, yes. Everybody is in this life cycle of evolution in different parts of the organization. Of course, in the technical communications arena, structured content has been the norm in many cultures for years. But, in many other parts of the organization, topic-based authoring, structured authoring, in general, is not only not normal, it's completely foreign. And so there's this cultural shift and awareness shift that's happening across the enterprise. I believe that easyDITA plays an interesting role in that it seems like your product appeals to the onboarding of folks who are not traditionally used to structure.

What do you tell authoring groups that have only written in presentation-coupled environments? They're only used to creating marketing tactics or campaigns where the content has been built inside of a wireframe or a presentation format from the very earliest form. And now, all of a sudden we're asking people to look at content very differently and that needs to be decoupled. How do you help people make that mental shift from presentation-coupled content to modular, topic-oriented content?
That's a great question. I actually kind of feel like I should be asking you this question.

It's tough. The thing is we're kind of in early days, I think, for the shift for some groups. Like you said, this is the norm in tech pubs, and we were doing this for a long time and it’s been the majority of my career, which is every day a little bit longer, and we've been in tech pubs most of the time. So, I think in a lot of ways when you start talking to people who are in marketing, or even learning, or really any other info dev group, you have to talk in terms of what they're looking to accomplish now. Like, how their jobs have changed. And, it doesn't serve anybody to not be really upfront about the change really. Your jobs have changed. Things have changed, and at this point in time you're not ahead of it anymore. There's no way to be ahead of it because all this stuff is already happening.

If you want to be able to keep up from the perspective of rapid publishing, rapid iteration, being able to do app-integrated publishing, being able to drive more consistent messaging, being able to have completely presentation-agnostic content which is to say device-agnostic content, and being able to deliver that all intelligently, at this point in time it's still kind of in the, I would say somewhere in the early adopter’s phase, but we're very rapidly getting to a point where your only option is going to be to fast-follow. Because in every single industry there are marketing teams that are doing this. So, if you're not doing it this way, you're already behind some other people who are.
The ability to put messaging out that's going to be more consistent, have a more gradual slope from a pure value to technical proof is really big. What's traditional there? Traditional is, marketing people go and they put something up on the website that's meant to entice you. And how factually accurate it is can range. Obviously, we all try to be as accurate as we can, but really more of it is ad copy. It's trying to entice you to look at things, and do a little bit of light education and then immediately you're in the documentation.

So it's like this big step function, right? It goes from pure marketing content to doc-style content and nothing in between. And, the future is that ramp is much more gradual. You have your doc site and your website and all your materials as a more gradual curve, going from very high-level messaging to integrating technical proof along the way, along the customer journey, more gradually so it's easier to consume, it's easier to understand.

And this is one of the things that being able to share across these teams is going to enable people, and it's one of the reasons that you should be catching up, and you should be looking at this methodology as quickly as you can. So, I think there are a whole plethora of reasons to do this. I could sit here and list off reasons for the rest of this podcast as to why I think people should be moving in this direction. But, that's kind of where we start with us. So, it's like you're interested in structured content. Tell me why. Here's why I think you should be interested. By the way, it's going to be a change. This isn't business as usual. That's not the right version of the idiom is it? Is it business as usual? Sorry, it doesn't sound right, right now.
So, anyways but it's kind of new day stuff. So, look at it that way and get excited about it because it's going to be better. It's going to mean new things for you, new opportunities, new challenges, and I think the groups that love this stuff are the ones that are going to succeed. So let's love it and do it right.
I think we need to find ways to make those on-ramps as accessible as possible to the creative class. I mean, there's such an incredible variety of customer experiences that we're trying to create within an enterprise, and there's a lot of different kinds of people involved in the development of those experiences. There are a lot of groups that understand their role within Information Development and the need for the assets they're creating to be reused. There are others who see their role in communication as primarily ephemeral or campaign-driven or creative-focused, with a one-time-render kind of mindset. And, they all have different tools. A lot of creatives are working just in Illustrator and doing mockups and taking everything from a wireframe forward, and every time any content is discussed, it's always discussed relative to the presentation.
And so I always am looking at, how can we capture that content and move it downstream in a way that allows it to move towards structure? Because, just to your point, there's a graduated motion between pre-sales and post-sales content. And sometimes it's the product content itself. I mean, we work with software companies where content is both selling the product and then it's in the product and it is the product. And then there's the supporting of the product and the post-sales content, but then there's also other parts of that lifecycle where the same content used on the marketing website is also used in a post-sales support portal to help to upsell additional product.

So, there are all these permutations of content along that lifecycle. And I think it's incumbent upon the architects of these life cycles to help capture the content in the mode that our authors are used to and get it into our systems of record in as organized a way as possible. But then, to help other people to do additional structuring and annotation of the content so we don't have to rely on creatives to just blow up the way they work. This is challenging and really hard stuff that we're talking about here. But you're right. It's kind of the essence of the change we’re within.
Yeah, I agree with all that. I think that what it's going to come down to is that the teams are going to look different. Right? And I think they always do in a big way. So, I think where marketing teams in the past were probably purely creatives, they're just not anymore. I think there are developers on every team today, or people who are more on the technical side of things. And, I think that in a lot of ways you kind of need that same transition for information architecture, like, content people, be they tech writers or whatever it may be. I think a lot of the stuff too kind of flows from a change in how people buy, and what it means to be a customer. So, when you think about your business from a very top level, it’s predicated on people staying: if you're a subscription company, it's maintaining the subscription; if you're not a subscription company, it's repeat customers.
In today's world, we don't have brand loyalty the way we used to in the past because marketplaces have changed. You can one-click buy stuff on Amazon, blah blah blah blah, all that stuff. It's very easy to find new brands and switch quickly. So, like, what's your wall? What's your competitive moat against your competition? And it may be a number of things, but one of the things that it definitely is, is it’s knowledge. It’s education. So, one of the things that I say here, and also to some of our customers when we get on this topic, is that an educated customer is a more valuable customer, and is more likely to be a customer in the future. Because, if somebody takes the time to educate themselves on how your company does business, and what the best ways to use your software or your product are, and they come to really know those things really, really well, it's natural for them to continue using them. That's kind of the replication of brand loyalty. It's a knowledge loyalty to what is efficient for you because you understand how to use it.
So, when we're thinking about what marketing's job is, if part of marketing's job is to maintain current customer base, whatever form that may be, marketing has to think of its job as a knowledge transfer, as an educational job. At least partially. And, there are creative aspects to that, but there are also just pure knowledge, just pure documentation, just pure educational pieces of that. So, that reality has to redefine marketing, and go to market strategy, and total business strategy at some level. And, at that point at which it's redefined, it also has to redefine all the tooling and process and methods we use to accomplish these goals, which comes back to structured content, being kind of the foundation of this capability.
Wow. Yes indeed. What do you think is DITA’s role in this structured future? I mean I know that's a very loaded question for somebody whose company name has DITA in it. But, what is your sense of how DITA fits into a structured future?
I think I can actually probably give you a relatively fair answer to this question. Of course I’m going to have some bias, but I'll try to take my sales hat off as much as I possibly can.

DITA has a place, probably at every organization that creates enough content that they need the things that it provides. So, what are those things? DITA is largely presentation agnostic. It maintains linking structures really well. It does componentized reuse really well. It serves document-style content probably better than anything else out there today, unless you're looking at really specific things like S1000D, or journal authoring, whatever, JATS. So, unless you have a very specific use case, if you just have a general document use case, be it contracts, SOWs, proposals, long-form user guides, repair manuals, etc., all this stuff that really is documents, not just every-page-is-page-one style content, DITA is really good for that stuff. And so it kind of scales up really well, but it also scales down really well. So, it scales down to answer-style content, like micro-content, and it integrates with learning content really well too.

So, that's not to say it’s the be-all and end-all for content. It's really not, and it shouldn't be the only thing that you're using. So, a great example is, we do some of our API documentation in DITA but we don't do all of that in DITA, like we use YAML for some things. I think it's YAML, it's whatever the OpenAPI standard uses. You'll excuse me, I'm a little bit further away from some of these things than I used to be. But we use a different structured format because it's the right format for that particular application. And then, they're basically custom schema. Now, they still decompile back to DITA for us because a lot of our tooling is set up around that. So, we use more or less custom schema for some of the other things that we do, because DITA doesn't describe that stuff very well, and we wanted something more lightweight or more semantic in the way that we're using it downstream. I think that there is still a place for unstructured too.

So, when you think about where unstructured fits: anytime you need to create something where the document is the thing, so it's never going to be anything but that document, Google Docs is great for that. So, taking notes, sharing notes, collaborating on email responses, like, my notes for our podcast here. They’re in Google Docs and it's because I'm never going to publish them anywhere. There's no other system that’s ever going to connect to them. And the priority for collaboration and connection is my ability to share this with one of the people in our marketing department and have them come in and screw around with it easily. So, the presentation doesn't need to be consistent, it doesn't need to be brand aware, any of those types of things.

So, I think that in the mix of content creation tools there's room for unstructured for certain applications, and then there's room for highly structured, which is where DITA fits. DITA is not fully structured, obviously, if we want to be really technical. And then I think you look at the ways that developers will work with things that are either highly structured, like OpenAPI spec-style documentation, or things that are very much developer-to-developer communication. So, you look at, like, readme files, which are often written up in markdown for GitHub, and that's the only place they're going. And it doesn't really matter how structured it is, it just needs to tell the developer how to install this package, like “that fits there”. But, it probably doesn't need to be chopped up and put onto a dev site or into an app or something like that.

So, DITA occupies the space where it’s long term maintenance of content. It's very scalable. It can scale up to being large publications and can scale down to being microcontent. And, I think when you get to a certain volume of content, it's really good for reducing the overall maintenance overhead that you see with it, due to things like management and automated publishing that tends to be quite good for what it is, and the ability to componentize it, and add in things like variables into your text and things like that which maintain reuse structures. So, at that level, DITA fits really, really well.

Then I think you can blend all these things in around it, and if you have an open system, all these things can work together. You can pull content even from Google Sheets, if you wanted to, into a system like easyDITA. You can ingest markdown, you can mix it with your DITA, and you can have tech notes that are written in markdown, and they show up in your user guide, or on your doc site, or whatever that may be. You can document your API in YAML, OpenAPI, or whatever it is. That's a section of your doc site, and you can have links off to the procedural guide-based content, which is written in DITA and is just more applicable to that particular implementation.

So, I guess it's a really long way of saying: it fits at the higher end of structure, and in content spaces where there's a document-style application. So, things that have tables of contents, and if you need that to scale up really well, either multiple products, or multiple languages or you need the ability to inject true semantics and document metadata into that content to serve downstream processes of it.
So there's an argument to be made that DITA itself need never be uttered inside of a marketing department. Not because it’s not being used, but because it doesn't necessarily need to be understood if our content capture is aligned with DITA downstream. However, say you were presented with an entirely non-technical Director of Marketing in an enterprise, let's say a product marketer who's responsible for one particular product on a global basis, and they're used to working through IT, or at least a marketing operations version of IT, for kind of any content presentation. What would you tell them about DITA as an orientation to sort of “why should I care?”
So, in some ways, I actually agree with where you started on this, which is DITA need not be uttered. I would start by talking about structured content, and I would talk about how that serves their messaging requirements and their messaging objectives, and inside of that conversation, I would like to describe where DITA fits, where JSON style custom schema fits, and where some unstructured might fit if that's the part of the mix. But if you're talking about doing holistic campaigning and you're looking to create messaging that's consistent across materials and endpoints and all that kind of stuff, whatever those things may include, the ability to develop content in a way where you can remix it and you can reuse it very, very efficiently is really important to these types of operations.
A great example is, like, landing pages. Now, I know a lot of the Web CMSs out there have relatively good functions for pulling pieces onto landing pages, but a lot of this stuff gets pretty complicated, and it's still one of those things where it just doesn't totally fit all the time. So, it's kind of like, you can say “add three blogs to this.” So, like, okay, if those things are blogs they can show up, or some other, like, very type-based abstraction like that. Whereas, if you have content, and you have a modernized team for delivery of this content from a web perspective, you can take and you can reassemble content in a DITA methodology, which is to say, a DITA map that pulls in these three or four pieces, and you can publish that and it can generate a landing page from a variety of pieces.

So, if you're messaging out to, let's say, industrial manufacturers or something like that, and you've got 10 different messages that you want to try with these organizations, you may have a bucket of content components, and inside of five minutes, you can create 10 different DITA maps lined up with these different components that will all render out into different landing pages. And then, you can go out and you can test this stuff.

Because you've separated the presentation and the actual content, your ability to set these things up and to iterate through testing and seeing what's resonating is orders of magnitude faster. And I think that that's good for marketing, but I think it's good for customers too, because what you're trying to get to is a conversation with your customers that they understand. A lot of times the only way to do this is to try it and to see what they respond to. So, if your technical stack is limiting the number of things that you can try and the rate at which you can try them, you need to look at your technical stack. And that's probably going to flow all the way back to how your content is stored and managed. DITA is an option for some of that, but it's not the only option. There are other ways to do it. A lot of teams do it in other ways, but the benefit of DITA is that the same content can easily be rendered into other things.
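As a rough illustration of the map-based remixing described above, here is a minimal Python sketch that generates several DITA maps from one shared bucket of topics. The topic file names, variant names, and the `build_landing_map` helper are all hypothetical, invented for this example; only the `map`, `title`, and `topicref` elements come from DITA itself. It does not show how easyDITA or any particular publishing pipeline works.

```python
import xml.etree.ElementTree as ET

# Hypothetical bucket of reusable topics (file names invented for illustration).
COMPONENTS = {
    "uptime": "topics/uptime-guarantee.dita",
    "safety": "topics/safety-compliance.dita",
    "cost": "topics/cost-of-ownership.dita",
    "support": "topics/support-options.dita",
}

# Standard public identifier for a DITA map document.
DOCTYPE = '<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">'


def build_landing_map(title, component_keys):
    """Return DITA map XML that references the chosen topics in order."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title
    for key in component_keys:
        # Each topicref points at an existing topic; no content is copied.
        ET.SubElement(root, "topicref", href=COMPONENTS[key])
    return DOCTYPE + "\n" + ET.tostring(root, encoding="unicode")


# Each message variant is a different selection and ordering of the same topics.
VARIANTS = {
    "landing-uptime": ["uptime", "support"],
    "landing-safety": ["safety", "uptime", "support"],
    "landing-cost": ["cost", "support"],
}

maps = {name: build_landing_map(name, keys) for name, keys in VARIANTS.items()}
```

Each generated map would then be handed to whatever publishing step renders a landing page; adding another message to test means adding one more entry to `VARIANTS`, not writing new content.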
So you can go out with white papers from that same content, or you can grab pieces of white papers. It doesn't have to be a blog. It can be a chunk of a white paper. It can be a paragraph. It can be a section. It could be one line. It could just be that the images are reused, and you can put that into your landing page or vice versa. The ability to remix is pretty infinite, realistically. So, go to print, go to white papers, go to digital hard copy, go to instructor-led training. So, maybe some of the stuff that your learning department is putting out is really valuable from a marketing perspective, because you've found that when this is presented to people in the classroom, they really light up. They understand this. This is obviously valuable to them. Well, if that's locked away in a PowerPoint somewhere, it's much harder to exchange those things easily. Whereas if that content is created in a structured format that can become PowerPoint or can become a landing page, the tool chaining around that and the communication is much easier.

Leave a comment on that paragraph, @mention your head of marketing, and they would see it in their email: “hey, we've done three trainings, and every time we get to this slide, this piece of content, people light up. Maybe you should think about putting this on a landing page.” Grab that chunk, throw it in the landing page, publish. That whole thing, that whole toolchain between your training department and your marketing department, sharing content in a way where you're taking experiences from your people on the ground and moving that into a web-based public delivery of content, that whole thing takes 10 minutes. And how would you do that before if you didn't have a structured system? Now, obviously there are some easyDITA features in that explanation, but whatever system you set up should have that capability.
So, it's that stuff. It's that speed, it's that capability, it's just how fast you can iterate and how effective you can be doing it. That's what people should care about. And the people who care about that stuff are the ones who are going to be more successful. So, I would argue that especially marketing people should want to be in that category.

Okay. And in the tech-comms world, this is an unusual conversation because DITA and marketing are usually not uttered in the same sentence. And so we're trying to find ways to create kind of a lingua franca around structure so that we can tie in with DITA, and tie in with all the systems of record that can handle DITA-compliant XML.

So, let's look at the other side of the enterprise, over in tech-comms, where structure has been advanced for many years, in most organizations for at least five years. But there are a number of organizations where structure has been around in one form or another for upwards of even 20 years, back to the very early days. I'm curious about the role of easyDITA within the overall landscape of CCMSs. How do customers end up choosing easyDITA instead of something like Vasont or Astoria or Ixiasoft? Also, on the tool side, how does it compare to somebody purchasing a cloud-hosted or installed version of something like Oxygen as an XML editor, and other XML editing products and IDEs? What are your thoughts about the technical communications landscape, CCMSs, and where easyDITA fits into that whole environment?
Sure. So I'll answer the Oxygen question first. So, we integrate with Oxygen. A lot of our customers, some percentage of their teams use Oxygen. I think that Oxygen isn't my preference in creating content, but I totally understand why it is for some people. It’s a good tool. If that's your style of content creation, like if that's the way you like to do it, they've built a good tool for that style of content creation. I would never tell somebody to stop using it because I don't think they should. Which is why we integrate with it, and you know our customers use it, and they use us, and it's good. It works out well.

As it relates to the repository side, so the other CCMSs that you mentioned, I get pushback on this, but I don't really tend to believe in competitive research. So I don't really know what they do, to be totally honest with you. I hear things and I believe some versions of some of the things I hear. I believe that every system that's being purchased out in the marketplace probably does something well enough to validate its existence.
But it's hard for me to make a comparison to our competitors. What I will tell you is what I think we do well, and maybe our approach to things, and why I think it has advantages over what you could imagine other approaches being. So, we have taken a very web-first approach to things. Our system is a SaaS system. It's delivered through a browser. You don't install stuff, that kind of thing. Which is really nice from an IT perspective, and it's nice from the perspective of standing it up. It's kind of one-click stand-up from the customer's perspective, which gets them up and running quickly. There aren't long install periods for setting up easyDITA, and I think that's a benefit.

But, from a methodology perspective, I think we've really focused on the 80% or 90% of what people do that's really important, with a slant towards collaboration. So, we do all the basic stuff. For the full CCMSs in the marketplace, there's actually a standard for this that a number of people worked on. I think it's an OASIS standard. Actually, the other co-founder of Jorsek, Casey Jordan, was one of the authors of the standard. Not the author, one of the authors. There's a standard that defines what a CCMS should do, and we comply with that. But beyond that, we have localization, and we've got high-quality search, we've got repository management, link management, metadata management, enforcement of structure, yadda, yadda, yadda, all the basic stuff that you need to do to be a real CCMS that adds real value. I would assume our competitors do too; I guess I don't know for sure because I haven't really used their software.

But, on top of that, what we look for is how can we make things happen faster? Just whatever people want to happen, how can we make that happen faster? And the big areas we focus on in service of that goal are collaboration and APIs. Those are the things that we try to own. We try to do the best. Notice I'm not saying better, but we try to do the best we can. I think we do collaboration really, really well. I think that is kind of our crown jewel. I think we're a really strong system from a collaboration perspective. At the time of recording, unless something has changed that I'm unaware of, easyDITA is the only fully collaborative XML authoring environment. Period. Anywhere. Nobody else has that. So that's the thing that we do that nobody else does. And what that means is that you can open up a DITA map in easyDITA. You can scroll through that map and you can drop your cursor in any topic and start typing, even if somebody else is in that topic, or that paragraph, or that sentence at the same time. So it’s a Google Docs-style experience, in easyDITA. In DITA. In structured content. You get tracked changes, you get threaded comments, you get real-time collaboration, and you get full contiguous document editing. So, you don't have to edit topic by topic.
And those things come together from the simple perspective of being able to grab a share link, send it to somebody, and be like, “hey, what are you doing?” Actually, I'll tell a funny anecdote about that. So, we're in the process of deploying this collaborative editor, and one of our customers who's using it now had an issue with their content the other day that they wanted help troubleshooting. So, it was a content issue, it wasn't a software issue or something like that. But we still support that. One of the things that we try to do is be good trusted advisers to our customers, and one of our customer success managers, or CSMs, said “send me a share link.” And they ended up being in the same topic at the same time, just talking through easyDITA as they were troubleshooting this. They didn't have to be like, “okay, send me the file. Okay, let me look at it. Let me open it up. Okay, I'll send it back to you. Here is the issue.” Share link, boom, open it up, there's the problem, here it is, all set, thanks.

So, just those little things that happen inside or outside of an organization allow people to collaborate better, and we're really proud of that and we think we do that really, really well. And then the other thing, the APIs. We try to be a good neighbor inside of content ecosystems. We will integrate with whoever needs to be integrated with. Our APIs are there, and if you look at three-point integrations, so you've got your APIs, we've got our APIs, and you need a point in the middle to make them talk to each other, you should be able to put a point in the middle between our system and another system and use the full suite of content management tools that are in easyDITA from that third point to talk to another system.
So, that means setting up your own continuous integration servers, or connecting to a delivery layer, or connecting to an app, or your own software for context-sensitive help, or whatever it may be, or a database you have. So, let's say you've got a database full of values and you want those things converted into variables that people can set inside of your documentation. We've got customers that reimport their entire parts database or pin database for semiconductors into easyDITA every single night. It's like millions of values.

EasyDITA scales really well, so I know we kind of market ourselves as being kind of fun and new, but we scale really well from an enterprise perspective. So, you take those millions of values, you put them in, they're all keys people can use, and everything's consistent. Being kind of a good digital neighbor inside of the organization, that's the thing we do well. We take a modern and good approach to existing inside of a company, adding value where we should, and not trying to capture other people where we shouldn't. And we think we do collaboration really well, and we think that speeds things along for people in the authoring process.
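To make the database-values-as-keys idea concrete, here is a small Python sketch that turns rows of values into DITA key definitions, which authors could then reference by key instead of retyping the value. The key names, sample values, and the `build_key_map` helper are hypothetical; a real nightly import would read from the actual database and push the result through the CCMS's own API, neither of which is shown or assumed here.

```python
import xml.etree.ElementTree as ET


def build_key_map(rows):
    """Build a DITA map in which each (key, value) row becomes a keydef.

    The value is stored as keyword text inside topicmeta, a common DITA
    pattern for defining reusable variable text.
    """
    root = ET.Element("map")
    ET.SubElement(root, "title").text = "Parts database values"
    for key, value in rows:
        keydef = ET.SubElement(root, "keydef", keys=key)
        topicmeta = ET.SubElement(keydef, "topicmeta")
        keywords = ET.SubElement(topicmeta, "keywords")
        ET.SubElement(keywords, "keyword").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical rows standing in for a nightly export from a parts database.
rows = [
    ("x100-pin-count", "48"),
    ("x100-max-voltage", "3.6 V"),
]

key_map_xml = build_key_map(rows)
```

In a topic, an author would then write something like `<ph keyref="x100-pin-count"/>` rather than the literal number, so a changed value in the database flows into every document on the next import.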
There's a lot in there. Well, thank you. And I didn't realize there was a real-time collaborative editing product à la Google Docs in the space, so that's good to know. So, we've really gone a lot further and deeper than we traditionally do on this podcast. Let's end with one broad question about tackling content sets that are moving to structure, and your advice for somebody who is looking at a pile of content. Sometimes that pile is 400 pieces, sometimes that pile is 20, 50, 100,000, or millions of assets, and they're looking at that pile considering a move, a shift towards structure. What advice do you have for somebody in that position? What should they consider in their journey?
That’s a great question. I think there's a couple of big points I would make on that. So, I think the first one I would say is content first, tools later. So, I think you need to be aware of what tooling is going to bring to your overall infrastructure. But I think these things need to be content led. I think you need to decide where content exists in your organization, which is sometimes to say “whose heads is it in?” And then you need to decide where it needs to end up, which actually is also kind of “whose heads does it need to be in?” And you need to focus on how that content gets between those places, because realistically, that's content, right? The whole purpose of content in content ecosystems is to move knowledge from one person's head to another person's head efficiently and hopefully without friction. And tools are going to play a role in terms of how people interface with transferring those things around and that kind of stuff.
But, in the middle of all that, you have to decide what content you have. What is the “isness” of our content? What is this piece of content? Is this a procedure? Are these things components? Do these things describe pieces of our product? Take the content and abstract it away from “this is a document, this is a user guide” and think about it at a more granular level. Think about what these individual pieces are, and what they mean to people, and where they come from. And do your homework, spend time on it. It's really important and valuable. And, if you look, I think I would have said five years in the future three years ago, and I don't think it's necessarily two years now, but in the very near future, this stuff is strategic.
You need to be able to really efficiently educate your people, and efficiently educate your customers, and you need to be able to do what I like to call “pre-customer education” really efficiently too. Which is another way of saying “you need to be able to message to your prospects”, but if you start to look at them as being pre-customers and your goal is to be transparent and educational with them, I think the way you engage with them and the way you deliver knowledge to them changes. So, I think that making that shift, because I think as buyers we've made that shift already, right? I think that we buy now more than we're sold to, and I think that in a lot of organizations, you need to catch up. And that catching up is really a function of the way that you create and deliver content to people, and then that has to come back to how is your content organized, typed, structured, assembled, delivered, and curated, maintained, and sourced.
So, start at the base. And the base is just, “what does the content look like? Okay, what tools should interact with this? Okay. What formats should these things, as they are, as content units, be?” If you're C-level or VP level, wherever, you're probably not doing this stuff by hand or yourself, but you should hire some content architects internally or externally. Consulting or not consulting. Or both. And you should make sure that the homework is done before you start buying tools, because I think all too often people are like, “oh, we're buying a new learning management system,” or “oh, we're going to buy this new CMS and this is going to solve our content problems,” or “we're going to buy Watson and we're just going to dump all of our documents into it and it's going to magically answer questions.” Guess what. It's not. So don't do that.
Love it. Thank you so much for your time, Patrick. This has been a wide-ranging and deep-ranging conversation. I feel like we could talk for hours more. So, thanks for surveying the landscape together. I enjoyed it.
Yeah, I had a lot of fun too, anytime. So, talk to you soon Cruce.
Thanks, Patrick. Bye-bye.

Highlighted Quotes