Welcome to Towards A Smarter World. This is your host Cruce Saunders, and I'm joined today by Michael Priestley. He's an information architect, taxonomist, content technology strategist and product owner working across IBM's marketing analytics team on taxonomy and information architecture all across IBM's many websites and content systems.
Michael has experience working with and across documentation, support, training, sales, and marketing content as an enterprise content technology strategist and somebody who helps to tie all of the pieces together in a dispersed team. He is an IBM senior technology staff member and was named an OASIS distinguished contributor in 2017 for his contributions to the DITA (Darwin Information Typing Architecture) content standard. Welcome, Michael.
Thanks very much for having me.
So glad that you're here. We've known each other for some years and this is really our first chance to talk in depth, and I'm glad we get to share this with our podcast audience. We have so many things we can cover today. You've had a history in enterprise content that rivals really anybody that has been wrangling these kinds of challenges, and you've seen the dark side of the content challenge from several seats.
And you've also seen a lot of the possibilities and potential that come out of collaboration. That is an exciting insight that I really hope we'll be able to get across to our listening audience today. For those who don't know your work, can you talk a little bit about your journey at IBM and your adventures in structured content before you arrived in marketing? It would also be good in there to define DITA for folks who are not familiar with it.
Sure. And that's certainly been a big part of my career, although not recently. I started at IBM fresh out of university about 25 years ago as a technical writer documenting very technical products like compilers, lexers, and API class libraries. And I was working as the information architect for one of IBM's new products in that area when I got the opportunity to participate in an initiative to define our new source standards and content standards across IBM, where previously there'd been an SGML flavor and we wanted to move to XML.
IBM has got a huge history with structured content and markup, including Charles Goldfarb, the inventor of XML, sorry, of SGML, being an IBMer. And we wanted to do the right thing and we had lots of use cases for it. And I had some history with single sourcing and with managing links with tools that I created myself for my own use. And I imported that into this work stream.
And those became parts of this new standard called DITA, or new at the time, which was based around the idea of modular chunks of information that you could pull together into different assemblies with the appropriate links and navigation. It was really a major use case for the kinds of user assistance that shipped with products, embedded assistance, help systems, and we would produce from that same content organized and linked differently printable PDFs, embeddable assistance and also web pages.
And so we managed to get all of the IBM technical writers, who at that point had split into more of a PDF community versus more of an HTML community, back into one shared system, with 1,000-plus writers all working on the same standard and producing consistent, high-quality technical information that adhered to our own internal standards using the standard. And one of the mistakes I think that we made with the SGML flavor that preceded it was keeping it proprietary.
And we'd started with something that when we rolled it out was really cutting edge. But then it stayed IBM only and gradually the industry got ahead of us. We watched how other standards with continued investment and involvement from a broader set of participants gradually outpaced what development we could maintain internally. And so we decided next time around we want to be part of the party. We don't want to be off in our own corner.
And so we contributed the DITA work to OASIS and maintained a strong involvement for many years and continue to do so today. That's getting up to the DITA stuff, should I keep going past that?
Well, you know, I think it would be interesting to connect to your marketing work in the current day as well, because there's a bridge between technical communications and some of these technical acronyms, like SGML, XML, DITA and other things that manage structured elements for the portability of content sets. Now that same thinking style is helping in some of the work you're doing in information architecture. So I think it'd be good to help bridge to the present and help tie together some of what you learned in the past with some of what you're doing in marketing.
Sure. So once DITA had taken off within the technical community, and I'd really moved from being an information architect at the product level to being part of the corporate standards and tools team, that team's mandate started to expand because of our success. And that gave me the opportunity to start working across all the different content areas within IBM, including marketing and sales and training and support.
The challenge was as our scope grew, I think our influence gradually diminished and was diluted and eventually that mandate was absorbed by another team. And I was left behind in the enterprise space where I'd been working on CMS strategy and enterprise content management strategy across IBM. And so that role continued, but really with more of a marketing focus. I became the product owner for a new website and then moved into taxonomy work, which I had some history with as well in the post-sale space.
And now I'm in my current role as information architect, where once again I've got that increase of scope. I'm looking across all of IBM, but at least so far, I've managed to maintain some influence. And I'm hooking up with people who are able to and willing to work with me on applying the standards, not just defining them. And in terms of the relationship with DITA, I think a lot of the same mindset is there.
I certainly think the same way in terms of connecting content to user needs and thinking in terms of structure and trying to be repeatable and organized and transparent and useful. The actual elements of the information architecture are slightly different, but there's a lot of the same sort of thinking behind it.
I certainly have the long-term goal that once we move beyond the page to page relationships that I'm really focused on right now, into the content of the page and the individual modules of content and the structure of those pages, there will be an opportunity to reintroduce the idea of not only standardizing those content models, but externalizing those standards and bringing that back to the DITA community.
Right now, that is much more of a long-term hope than an immediate task. My goal right now is just to get everyone agreeing on how we're going to manage the IBM.com web experience in terms of its navigation, linking and metadata.
You know, this bridge between internal standards and public standards is one that IBM has navigated over the decades incredibly adroitly. We have a similar conversation happening in a lot of enterprises today where internal content standards within one department need to be federated externally within a company. So that one niche content model, structural model for, for example, a marketing-controlled website might also overlap with, for example, the technical content model that is used in a customer support portal.
Which today is often siloed and independent and run in an environment fed through a CCMS. So that our web content management platform or our digital experience management platform is somehow separated from our content portals that deliver customer support content fed by something that runs in a more structured setting. So the fact that you're bridging that world using a standards-oriented mindset I think bodes well for IBM as an organization.
It's something we certainly recommend that all enterprise content owners consider the path between internal standards within a department and how those connect to the wider ecosystem. So it's, I think, a treat for everybody to get to hear some of how you've connected those dots. Now you're applying standards on the semantic side to information architecture, and I'd love to dive into that more. But first of all, for people that don't know, let's just start with the basic definition for information architecture. How would you define information architecture, especially focused on websites?
Yeah, there are a lot of definitions floating around out there. So we actually ended up crafting our own definition as part of the input to the work to make sure that we had agreement on what we were talking about. We all agreed that we needed it; we wanted to make sure we all agreed on what it was. So the working definition we used was: a set of models that guide the application of taxonomies and other standards for governing navigation, linking and page identity or location that improve the customer experience.
And just to unpack that a little bit. So a set of models, it's not just one standard, it's probably a set of standards. And it's not the taxonomy, it's how we're going to apply the taxonomy. So it's working at this sort of very broad and general level, but driving requirements on how we're going to govern and connect pages through navigation like menus and linking like internal links or CTAs or breadcrumbs as well as where pages should live with the ultimate goal of making our customers' and potential customers' lives easier as they try and find and get the information they need from our website.
And this ties information architecture together as an exposition of a larger model, which is ultimately semantic. I would consider taxonomy in the realm of semantics. And so essentially that definition sounds like it's bridging between a semantic standard and the representation within the navigational experience of the user. Is that a fair characterization or how would you elaborate that?
I think it's more than fair. What we're actually trying to do is take a, I think the term for it is a topic clustering methodology, which is an SEO term for basically linking related pages to a pillar or a central page for a term of interest. And we're taking that approach and really trying to scale it to an enterprise level, where we can apply it to each topic of interest to our customers.
And those topics are ones that we are gathering and managing in an ontology. For each of those values we want to have a pillar page that clearly lays out IBM's perspective and point of view on that subject and acts as a hub for linking to the various other pages that might provide a unique perspective or additional depth.
And so that's a direct expression of semantic relationships where we have not only pages that are classified as being related to a semantic value, but we're also saying one of those pages actually instantiates the semantic value. And the linking to that page actually is an expression of classification or membership in that subject area.
And we're applying that same approach to not just general topics of interest, but also more controlled vocabularies like product names, for example. So we can say all of the information that's relative to or related to a product we want to make sure is accessible from a home product page and also vice versa links to that product page.
So your example of the support experience being siloed from the marketing experience, we want to make sure that there is a product page that provides clear access to both pre-sales and post-sales activities and could be the permanent home for product actions and interest throughout a customer's relationship with the product.
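As a rough illustration of the pillar-page idea described above, here is a minimal Python sketch. It assumes each page carries a set of topic tags and a set of outbound links, and it reports pages that are classified under a topic but do not link to that topic's pillar page. The `Page` class, the `missing_pillar_links` function, and all URLs and topic names are hypothetical, invented for this example; they are not IBM's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    topics: set[str]  # ontology values the page is classified under
    links: set[str] = field(default_factory=set)  # URLs this page links to


def missing_pillar_links(pages: list[Page], pillars: dict[str, str]) -> list[tuple[str, str]]:
    """Return (page_url, pillar_url) pairs where a page tagged with a topic
    does not link to that topic's pillar page."""
    gaps = []
    for page in pages:
        for topic in sorted(page.topics):
            pillar = pillars.get(topic)
            # Skip topics with no pillar yet, and the pillar page itself.
            if pillar and pillar != page.url and pillar not in page.links:
                gaps.append((page.url, pillar))
    return gaps


# Hypothetical example data, not real IBM.com URLs or topics.
pillars = {"cloud": "/topics/cloud"}
pages = [
    Page("/topics/cloud", {"cloud"}, {"/blog/cloud-intro"}),
    Page("/blog/cloud-intro", {"cloud"}, {"/topics/cloud"}),
    Page("/docs/cloud-setup", {"cloud"}),
]
gaps = missing_pillar_links(pages, pillars)
print(gaps)
```

In this sketch, linking to the pillar page is treated as the expression of membership in the subject area, so every reported gap is a page whose classification and links disagree.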
It's quite beautiful. The image you're painting is one of object-oriented content in which that product becomes essentially a form of hub for other related interactions that a customer might have with that content, which it sounds like the IBM team is organizing in semantic terms.
So that each one of those related concepts may be addressed in relationship to not just on a single page in a single place, but wherever those relationships exist may be represented in close relationship by having proximity within the semantic model that governs it. Is that right?
Yes, it is. I've been struggling with how to label the kind of information architecture work we're doing. At various times I've called it interest-driven because that really is the first thing we're doing is understanding what's bringing people to the IBM.com website. And then we're creating pages and relating pages to organize the website around those interests. I might point out that this is dealing with a website that has tens of millions of pages.
It's a sprawling monster of a site and it's in constant daily use. So it's not one of those things where you can just burn it down and build something bright and shiny from the ashes. We have to figure out how to renovate the website thoughtfully and intelligently to increase and reorient it without breaking its existing use. So that's one of the challenges. But yeah, interest-driven, object-oriented, topic-oriented information architecture. I think those are all useful descriptive terms for it.
That interest association is something that we've seen as a repeated theme with our clients working on personalization initiatives. There is an almost essential pivot around interest or intent.
That becomes semantic as well. So we're working on intent taxonomies as a way to organize relationships between topical nodes within a taxonomy, the topical parts of our content domain, and the customer's intended interaction. That's also an association or a tag applied to a content set, so we can put those topics and intents into some sort of relationship. Does that sort of mindset exist within the thinking of the IBM.com team?
Absolutely. And yeah, one of the key catches for this work is that we have a team working on personalization strategy. So one of the first relationships we built was with that team. And the way I've been characterizing it is the information architecture is trying to develop a standardized street grid for IBM.com.
And then once we have that standardized street grid with labeled streets and an ability to navigate from any place to any other place, then the personalization engine becomes the route finder, emphasizing where you need to go next to get to the places that you need to reach. And the route finder is useless without the street grid. But the street grid also has value in its own right. It's not just an engine for personalization, but it is a precondition for it.
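To make the street grid and route finder analogy concrete, here is a minimal Python sketch under simple assumptions: the street grid is a directed graph of pages and their links, and the route finder is a plain breadth-first search for the shortest click path. A real personalization engine would weigh far more than path length. The `find_route` function and every URL below are hypothetical.

```python
from collections import deque
from typing import Optional


def find_route(grid: dict[str, list[str]], start: str, goal: str) -> Optional[list[str]]:
    """Breadth-first search for the shortest click path from start to goal."""
    if start == goal:
        return [start]
    previous = {start: None}  # maps each visited page to the page we came from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in grid.get(current, []):
            if nxt in previous:
                continue
            previous[nxt] = current
            if nxt == goal:
                # Walk back through the predecessors to rebuild the path.
                path = [nxt]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None  # no route: the grid does not connect these pages


# Hypothetical link graph, not the real IBM.com structure.
grid = {
    "/home": ["/products", "/topics/cloud"],
    "/products": ["/products/widget"],
    "/topics/cloud": ["/products/widget", "/blog/cloud-intro"],
    "/products/widget": ["/support/widget"],
    "/old/landing": [],
}
route = find_route(grid, "/home", "/support/widget")
print(route)
```

The `/old/landing` page shows why the grid is a precondition: with no links in the graph, no route finder, however clever, can suggest a next step from it.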
This reminds me of the application of graphs to wayfinding that has been in discussion within data science circles for some time. It seems like a portion of the AI and machine learning community has settled on graph structures as being the most nimble for that kind of wayfinding traversal. That begs the question about the evolution of taxonomy work toward formal ontologies and ultimately machinable knowledge graphs.
I believe you've done some work with our mutual friend, may he rest in peace, James Mathewson on IBM.com. And as I understand it, there was some involvement of a formal ontology with the objective of search engine optimization being one of the outcomes of that work. But certainly with a lot of other implications and information architecture, content discovery, content recommendation in general. Could you tell us a little bit more about that project and how the taxonomy work is related?
Absolutely. The ontology of topics of interest that we're working with is actually a direct descendant of James's work with Dan Segal, who continues to work on the project as part of our broader SEO focus. And it's already in use in multiple different ways within IBM, some of them customer facing as filters and menu choices, others inward facing with analytics, where we can track a particular set of customers that have journeyed through not only pages, but also topics of interest to them.
And a key part of that, so there's two parts to that project. One is the ontology itself, which describes what are the topics of interest, what are the keywords that feed into that topic and cluster that topic. And then the second part of it is a Watson-based tagging service that inspects the content of a page and then maps it to one or occasionally a couple of those topics.
And that's what gives us the analytics reports internally, as well as effectively a way to tag pages at volume and at scale since again one of the challenges we've got is if you've got tens of millions of pages and you want to build a linking and navigation strategy that depends on metadata, how do you get the metadata there? You're talking about in some cases, really old pages that still need to be up there, but don't have a lot of maintenance resources available.
So automated tagging with a high degree of quality and relevance is key to getting the metadata available for the page to then automate linking engines, navigation engines and personalization.
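The two parts described above, an ontology of topics with their keywords and a service that maps page content to one or two topics, can be sketched in a few lines of Python. This is a deliberately naive keyword-counting stand-in and not the Watson-based service mentioned in the conversation; the `ONTOLOGY` contents and the `tag_page` function are assumptions made up for illustration.

```python
import re

# Hypothetical ontology: topic -> keywords that feed into that topic.
ONTOLOGY = {
    "cloud": {"cloud", "kubernetes", "containers"},
    "analytics": {"analytics", "dashboard", "reporting"},
}


def tag_page(text: str, ontology: dict[str, set[str]], max_topics: int = 2) -> list[str]:
    """Score each topic by keyword hits in the text and return the best few."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {}
    for topic, keywords in ontology.items():
        hits = sum(1 for word in words if word in keywords)
        if hits:
            scores[topic] = hits
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda topic: (-scores[topic], topic))
    return ranked[:max_topics]


tags = tag_page(
    "Run containers on Kubernetes in the cloud with a reporting dashboard.",
    ONTOLOGY,
)
print(tags)
```

Capping the result at `max_topics` mirrors the point that a page maps to one or occasionally a couple of topics, rather than to everything it happens to mention.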
Interesting. This is a machinable graph as the basis for information architecture and customer experiences. And I think that theme will reverberate over the coming decade in ways our listening audience will find incredibly compelling and interesting. So it's certainly worthy of note that we're dealing with concepts that I believe will really underpin a lot of the future for intelligent customer experiences.
You talked about a lot of the machine intelligence here. I'd love to kind of reel us out of machine and AI ontologies and into a human dimension, which is always the denominator for every project that results in a digital output. It's always a very analog human set of activities that make that happen.
We'd love to understand your perspective on working across silos, especially in a giant organization that literally is dealing with tens of millions of content renderings, pages, assets. So no one person or one group could ever hope to represent all of the knowledge domains within that content set. How do you put together a conversation between stakeholders about content structure, about semantics that touch multiple people, multiple departments?
Right, and I mean, no lie. It's tough. That's the greatest challenge of operating at the scope I'm operating at right now is that historically we've got a lot of incentives as an organization for teams to worry about their own success ahead of the success of the company. And I think any large company has that tension. There's a couple of things that are proving useful this time around.
One is simply that I've got my own history with these groups. So I have a lot of context, I've got some credibility with the organization from my history, and that lets me in the door to start the conversation. The other thing that helps is my home right now is in the marketing analytics organization, reporting up to the CMO, which has a lot of credibility within IBM and also is the source of analytics on page performance.
So when we start telling people we're going to be reporting on how well your pages do based on, among other things, how well they conform to this new information architecture, and that's going to go up to your VP and all of a sudden that matters, right? Because we're controlling the way success gets measured. That's a degree of integration that I did not have previously in my career, and I'm certainly benefiting from it now.
And a huge credit there goes to the leadership in marketing analytics, whose focus is on driving customer value first, foremost and as its major structural principle. And that's what's driving everything else. I mean, we talk about ontologies. We talk about cognitive tagging. We talk about knowledge graphs. There is no particular focus on that.
What there is focus on is what do we do to get the content focused on the customer? And doesn't matter how complex the answer gets as long as it's making it simpler for the customer, that's the focus. And they're willing to take on complexity, they're willing to take on work if it makes a difference to the customer experience.
That also as a principle carries across, so when I talk to these other groups, again, the first focus is "this will help the customers, and it will help you drive the conversation about how it's helping your customers using our reporting tools." And then finally, there's one little trick that I've been using to just quickly set the bar as to where we are today and where there are problems.
I'll show them a mockup or a diagram of like here's all the pages that are relevant to a topic that you care about or a product that you're working on, the different parts that we'd expect and how we'd expect them to link together. And they say, "Yeah, that makes sense. That's an obvious good thing. It makes sense. We should do that." Then I say, "OK, well, let's Google it."
And I bring up Google and I search on that topic with a site:ibm.com restriction. And I then look at the half a dozen to a dozen pages that are showing up and how many of them are actually linking to each other and how many of them are just completely orphaned entryways to that subject with no cross-linking or cross-integration.
And we've got a history as an organization of publishing new pages and forgetting the old pages, and that the old pages are where the search rank is accumulated. And people have been linking to those old pages and they've got credibility and they've got Google Fu. And we're not taking full advantage of that within IBM when we only focus on new content and new pages.
So that's a way for me to bring the conversation back, to look at the whole space. We agree on where things should be. It's really easy to see how bad it is. So let's start fixing it. And that helped start the conversation.
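The cross-linking check described above, taking the pages that rank for a topic and seeing which ones link to each other, can be approximated with a short Python sketch. It assumes the outbound links for each result page have already been collected somehow; the `orphaned_pages` function and the URLs are hypothetical.

```python
def orphaned_pages(links: dict[str, set[str]]) -> list[str]:
    """Given result pages and their outbound links, return the pages that
    neither link to nor are linked from any other page in the result set."""
    pages = set(links)
    connected = set()
    for source, targets in links.items():
        # Only links that stay inside the result set count as cross-linking.
        for target in targets & pages:
            if target != source:
                connected.add(source)
                connected.add(target)
    return sorted(pages - connected)


# Hypothetical search results for one topic, with each page's outbound links.
results = {
    "/topics/cloud": {"/blog/cloud-intro", "https://example.com/partner"},
    "/blog/cloud-intro": set(),
    "/old/cloud-2012": {"https://example.com/partner"},
    "/press/cloud-launch": {"/press/cloud-launch"},
}
orphans = orphaned_pages(results)
print(orphans)
```

Self-links and links to pages outside the result set are ignored, so a page that only points elsewhere still counts as an orphaned entryway to the subject.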
Wow. OK. So a couple of things to unpack there. One is UX side of the conversation. There is a part of this lifecycle you're describing that involves visually showing what kinds of pages are impacted by content relationships and what that user experience looks like. That really helps to make it possible for folks dealing with customer experience and streamlining the effort out of those customer experiences and making things as organized and accessible and topically relevant to customers as possible.
Building bridges between the engineering and strategy artifacts like the taxonomy and its various representations and turning that into something visual, something that people can get their collective creative energy around. Can you dive a little bit more into your interactions with UX stakeholders? And how others dealing with these taxonomy and semantic sources can better visualize how those can help what they're trying to accomplish for the customer? And further, the work that you and your team are doing.
So one thing to keep in mind is I'm trying to limit my scope right now, which means that I'm only dealing with about half a dozen different web management systems and UX teams. And some of them are more strategic than others. I do want to say that I would not be getting anywhere without the support that I'm getting from like the IBM.com design team and their UX folks who are fantastic.
And I'm effectively acting as an additional member of their team just with a foot in analytics as well. And when I'm talking about visualizing the page relationships, I'm often doing that with a very abstract diagram just with URLs filled in or page captures filled in. So I'm just talking about here's our existing pages and how they should relate and here's how they actually relate.
And one of the reasons I do that is to avoid stepping on the UX toes in the different areas. And I know we need to get to alignment on how the links and navigation get expressed, but it's like the first thing I want them to do is agree structurally there should be links. And once I get them to agree this is a good thing and we need to do it, that gives me the leverage to force the conversation on how we should do it in a consistent way.
If we start with like let's all agree we need breadcrumb links that appear in this way and in this stage for this kind of information, then we basically get hung up on the argument of how they should look and people give up. If we get agreement that we need the links and it's just a question of how, then people have more skin in the game and they've already got an agreement to put the link.
And now it's simply a matter of getting to consistency. And so I'm trying to break that decision into a couple of stages so I can get the first win to build momentum towards the second. If that makes sense.
It does. And what about the analytics side of that conversation? Are they in the room or are those separate conversations?
So the analytics side of it is multi-faceted. There's a big analytics team with a lot of different focus areas, including search and SEO, and also the reporting dashboards and measuring campaign success versus page journeys and research. So some of them are in the room all the time. Some of them are in the room like once every two weeks for like a sprint review.
The SEO folks are probably the ones I work with most closely. And James Pate from that team has been close to a co-lead for the start of it and continuing to help keep us honest and focused on the actual terms of interest to people. Not just branching out into something abstract and independently aesthetically pleasing, but not grounded in actual customer interest.
In that respect, the analytics of what people are looking for and what effect changing our pages has on their success, that's a constant relationship with the marketing analytics teams on both sides.
Interesting. I feel like, just as the semantic relationship between topics and intents is a precursor to the success of an information architecture that's expressive and flexible and compelling to users and the technical systems mediating it, the connections between stakeholders within the digital team are equally a dependency for an integrated, successful effort.
Yeah, absolutely. And I mean, what we're really doing is using the fact that reporting has that cross IBM scope already and that cross IBM credibility to help leverage awareness and adoption of other standards that improve their page performance. So it's absolutely relevant, it's not just saying, "These are the pages you've got that are performing well, and these are the ones that are performing poorly." It's saying, "These are the pages that are performing well, and we think here's why.
And here are the pages that are performing poorly, and we think here's why." And also starting to think of not just reporting on how individual pages are performing, but how they're performing as part of an information space that the user is trying to navigate. And how easy are we making it for the user and where are we forcing them to give up and forcing them to go back to Google? And so the lines between silos are drawn by areas of common interest from the customer point of view.
And so we're saying if the customer is interested in both of you, you need to talk. You need to be coordinated because the customer is going to be looking at both of you, and if you're not coordinated, they're the ones who get exposed to that difference. We can't keep publishing our org chart. But that's not my original.
No. In fact, that is the number one information architecture strategy is basically read our org chart.
Yeah, absolutely. I just want to be, oh man, I forget who said that first, but it wasn't me, just transparency.
It's so true though. I mean, interdepartmental silo-based authoring processes express themselves in information architecture far, far too often, instead of customer need, intent and topic-based discoverability. I think the scenario that I'm imagining will be very prevalent in the future will include flexible information architectures that evolve along with customer behavior and that may represent themselves in different permutations, depending on logged-in customer data or even implicit data about customer browse behavior and intent identification.
So that information architecture itself may become intelligent and flexible. Have to be careful because like you said, there's a street grid and then there's a navigation strategy. So that's important that there's some source of consistent truth for a base experience. But I do think there's an opportunity for machines to present content in ways that are very useful to reduce customer effort, including navigational options that may evolve. Do you see that happening in the future?
Oh, yeah, absolutely. And yeah, very near future. And again, to come back to that street grid relationship, when we're capturing this metadata on the pages and then driving the links from that, it also means that when we record people's journeys and can say, "OK, someone came to our website and they were interested in these things and they went to these pages."
And then we start to get a sense over time of what pages work well together and have worked well together. And it's like the Netflix model of what you should watch next based on not only the metadata relationships between articles or movies or whatever, but also based on what other people have been interested in. And we start to compile an understanding of people's commonalities as well as their unique progressions.
And so capturing the metadata on the pages makes our understanding of the user actionable. And I apologize for using that word. And we're able to act on that information to help users and present guidance where if we didn't have that history to start with, we'd just be personalizing a bunch of hypotheses. And that's great.
But you've got to measure the result of the hypotheses and feed that in and improve your hypotheses. And that's where it's not just the personalization but the analytics to feed it and keep it honest and keep it improving for the sake of the customer is key.
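The "what to look at next" idea described above can be sketched as a simple co-occurrence count over recorded journeys. This Python example uses journeys alone and leaves out the metadata relationships that the conversation says would be combined with them; the `next_page_suggestions` function and the journey data are hypothetical.

```python
from collections import Counter


def next_page_suggestions(journeys: list[list[str]], current: str, top_n: int = 3) -> list[str]:
    """Suggest pages that most often appear in the same journeys as `current`."""
    counts = Counter()
    for journey in journeys:
        if current in journey:
            # Count each other page once per journey, however often it was visited.
            counts.update(page for page in set(journey) if page != current)
    # Most frequent first; ties are broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:top_n]]


# Hypothetical recorded journeys (ordered page visits per visitor).
journeys = [
    ["/topics/cloud", "/products/widget", "/pricing"],
    ["/topics/cloud", "/products/widget"],
    ["/topics/cloud", "/blog/cloud-intro"],
    ["/products/other", "/pricing"],
]
suggestions = next_page_suggestions(journeys, "/topics/cloud", top_n=2)
print(suggestions)
```

Suggestions like these are still hypotheses; measuring whether visitors who follow them succeed is the analytics loop described above.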
I love that. I think it's a good place to end. We've really ended up talking in a very comprehensive way around customer experience management and IA, UX, content strategy, content engineering terms. We didn't get as much into the standards side of the conversation as I was hoping to, but I think we covered a lot of valuable material.
I'd love to leave with a conversation about the skillsets that support the kinds of activity you are currently seeing leading digital initiatives within IBM and how either career professionals with longtime technical communication skills can repurpose into these kinds of areas or new folks entering the job market who may have a background in either library science or some sort of technical communications baseline.
Unfortunately, there's very little of that, but they're entering the space trying to figure out how do I build my skillsets? Or how do I transfer the skillsets I have into these next generation sorts of digital projects? Do you have any guidance for folks in their own career development? What sorts of skillsets are underserved? Where are the opportunities? And how can people build awareness in these important areas that you've spent decades learning how to work through?
Yeah, I wish I had a more focused answer here. The reality is my own career has been I'm constantly attracted like a moth to the flame to the idea of enterprise wide information architecture or content alignment. It's a hard row to hoe and I don't know, I keep coming at it from different angles.
I'm continually optimistic and I'm optimistic now. That said, I don't know. The path I've followed isn't a particularly repeatable one for getting into. Certainly, structured content remains extremely relevant to marketing, content management, all of those things, and to web design. Leaders like Karen McGrane in that space have made the case sort of very strongly and clearly.
And so I think that sort of content strategy, including IA, remains, if it's not a growth area, it darn well should be. We'll put it that way. That said, I think from a pure technical writing perspective, one thing that I'm seeing more of, which I celebrate on the marketing side, is a recognition of the need for deeper, more technical, non-fluffy, helpful content.
So effectively the kinds of deep guidance and user assistance in the pre-sales context that has been the bread and butter of quality technical communication for decades. And so I think one point to make is I think that need is still there. And I think recognizing that need is a marketing need as well as a technical writing need. I hope that recognition is spreading.
I think one of the things that helped within IBM with that was our journey analytics that showed us just how much of that content was needed and used, leading to an emphasis on creating and managing it. But yeah, I don't think everyone needs to go for information architecture or shoot for that cross-silo bridging goal.
But if you're looking for opportunities beyond technical communication and you're good at writing, and you're good at understanding complex material and turning it into understandable prose that's clear and concise and user focused. I think there's an increasing recognition that that is valuable marketing content as well. And don't be afraid to consider those opportunities.
Terrific. Yeah. I believe there is going to be a lot of transformation in the education system in order to accommodate these kinds of skillsets: content strategy, content engineering, content operations, taxonomy design, ontology development, information architecture and all of the other related disciplines that lead to next generation intelligent customer experiences.
You have lived this challenge of building customer experiences at massive scale in the tens of millions of content items and working across lots of teams. Your comment about constraining scope is also very telling because in the face of that complexity, the only way to make progress is through patterns and then constraint.
Applying those patterns in a constrained and focused domain in order to prove their value and then move on. And you're teaching our listeners how to do that as well as I think all of us who have been party to this conversation today will take away a lot of other lessons. Michael, thank you so much for spending this time together today.
And thank you. I always enjoy our conversations and it's always a journey and I've enjoyed it very much. Thank you.
Thanks, everybody. Until next time, one step at a time Towards A Smarter World.
This episode of Towards A Smarter World is brought to you by [A], the content intelligence service. Learn more about intelligent customer experience powered by content strategy, engineering and operations at simplea.com.