Can AI Generate Views of the To-Be Architecture?
Here, we present part three of our blog series “Generative AI and Enterprise Architecture.” This series provides an in-depth perspective on how generative AI will reshape EA as we know it. To catch up on the earlier parts of the series:
- Generative AI and Enterprise Architecture: Impact on Enterprise Complexity
- Generative AI and Enterprise Architecture: Modeling the Enterprise
In this third part, Ed Granger and Ardoq’s Chief Enterprise Architect, Jason Baragry, tackle the big question of whether generative AI can help EAs and enterprises build better roadmaps and target architectures faster.
Axiom Three: Enterprise Architecture Must Roadmap the Future
Let’s jump straight into Axiom Three without a preamble: Enterprise Architecture Must Roadmap the Future.
It should be beyond debate that Enterprise Architecture’s fundamental role is to plot the enterprise’s course to the future.
It’s an activity that takes many forms: scanning business and technology horizons, co-creating strategy, envisioning future states of IT and business operations, and planning and scheduling the execution of change initiatives. It happens across a range of scopes, from whole target business models to regional application portfolios to individual project solution designs.
No matter how diverse the activities, they tend to manifest as a fairly predictable set of products: Roadmaps are by far the most-requested artifacts from EAs, and solution architectures — and their whole supporting scaffolding of principles, standards, patterns, and so on — are their tactical elaboration.
If you're hungry for more content on AI and EA, watch our webinar on-demand: Enhancing Enterprise Architecture with AI: An Ardoq Perspective
So, how far can generative AI help us in creating these views of the future enterprise?
There’s really not a lot of published research out there, but there is some.
For example, Gartner recently started publishing its own views on AI’s ability to automate architecture. A recent infographic on AI-assisted EA activities rated activities such as "business road-mapping," "solution architecture generation," and "augmented EA governance" for both business value and technical feasibility.
However, details on exactly how these will work are lacking. Without that level of understanding, there’s a risk that we will treat generative AI as a "magic box" that we assume will produce the right results.
But can it?
Knowledge needs to be based on experimentation. We ourselves have conducted a fair amount of research into automating processes like roadmapping and solution architecture generation using conventional algorithmic approaches, enough to convince us that most EAs hugely underestimate the extent to which these processes could be automated.
However, we didn’t really have a strong view of what generative AI’s capabilities were, and the best way to answer that was to get hands-on.
Building a Value Hypothesis
Before we started our experiment, we needed a hypothesis to test.
Creating a roadmap or a solution architecture is a complex activity. It needs multiple inputs and has multiple parts. Even with the impressive capabilities of LLMs, it seems unrealistic to expect them to fully automate the creation of these artifacts.
However, set against that is the knowledge that these are repeatable processes, with many example roadmaps and solution designs existing in the public domain. These are likely to have formed part of the LLM’s training dataset, as is information about business models and technologies relating to our task.
So, it’s reasonable to hypothesize that the LLM already has a level of relevant knowledge and that we can use this as an accelerator to increase our productivity.
We could test this hypothesis one of two ways:
- “Bottom-up” approach: We take a typical artifact like a solution architecture, break it up into its constituent parts, and test the LLM’s ability to generate them one by one.
- “Top-down” approach: Treat our artifact as a black box and simply ask the LLM to generate it for us, and then analyze the output to see where the gaps are.
We chose to go with the second approach.
Generating a Solution Architecture
We began by typing into ChatGPT, “Take the role of a solution architect: Generate a solution architecture document for a CRM application.”
It did a pretty nice job, with all the right section headers and helpful content under each of them. For example, under the Data Model heading, it listed the major data entities, and further down, it suggested the statutory regulations that will apply, in this case, GDPR and HIPAA.
However, this first generalized prompt didn't give us all the content we needed. Specifically, we were missing any information about the application integrations we’d need to build or modify. So, we needed to be more specific in what we asked of our LLM.
We ran a second round of prompting, which got it to list those application-to-application integrations, like email, payments, and ERP, along with (and this is a nice touch) the suggested integration protocols.
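The same experiment can be scripted rather than typed into the chat UI. The sketch below uses the OpenAI Python SDK; the model name and exact prompt wording are illustrative assumptions, not necessarily what we used:

```python
# Sketch of scripting the experiment against the OpenAI chat completions API.
# Model name and prompt wording are illustrative assumptions.

def build_messages(follow_ups=()):
    """Assemble the role-play prompt plus any follow-up refinement prompts."""
    messages = [
        {"role": "system",
         "content": "Take the role of a solution architect."},
        {"role": "user",
         "content": "Generate a solution architecture document "
                    "for a CRM application."},
    ]
    for prompt in follow_ups:
        messages.append({"role": "user", "content": prompt})
    return messages

def generate(follow_ups=()):
    """Call the chat completions endpoint (requires OPENAI_API_KEY)."""
    from openai import OpenAI  # imported lazily so the sketch runs offline
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=build_messages(follow_ups),
    )
    return response.choices[0].message.content
```

Our second round of prompting corresponds to passing a follow-up such as "List the application-to-application integrations this solution requires, with suggested integration protocols."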
Now, of course, GDPR only applies to organizations that process data about EU citizens, and HIPAA is a US health insurance regulation. Because we hadn’t told GPT anything about our industry or territory, it went ahead and made some assumptions. The document’s contents also referenced design principles that were entirely missing from the body.
So it earns partial credit: ChatGPT very quickly produced a great-looking document with some pertinent content.
Still, GPT knows nothing about our particular organization. So the next step was to tell it about the applications we actually use.
We told it that our organization uses Microsoft Dynamics for CRM, HubSpot for Marketing Automation, and SAP S/4HANA for ERP, and it regenerated the document.
However, this time, for some reason, it also decided that we use Outlook as our mail client, which we hadn’t told it. It’s a reasonable guess, but what if we used Gmail instead?
So we then had to correct it. But this is the trend: From here on, we were in a conversation about the specifics of our organization, spending increasing amounts of time checking and course-correcting the LLM.
It got to the point where it would be faster just to type the remainder of the document ourselves.
Findings From Our “Top-Down” Experiment
One lesson of this experiment is that both solution architecture creation and roadmapping are surprisingly complex activities. There’s little standardization out there, so formats differ from methodology to methodology and organization to organization. But we can work with representative structures like the one generated by GPT:
- Document outline structure
- Public-domain reference architectures, commodity technology components, and patterns
- Organizational re-usable technology components
- Organizational design principles and patterns
- Functional and non-functional requirements
- Option trade-off analysis against organizational objectives
- Planning, sequencing, collaborations, and dependencies
In our test, the first couple of steps were massively accelerated by generative AI. But from there on in, the speed dropped, and the effort rose sharply.
We seemed to have hit some limits of the LLM’s innate capability, so it helps to understand why.
Understanding the Limitations
Lack of Specific Organizational Knowledge
The most obvious limit is that GPT doesn’t have any specific knowledge of our organization — the business and technology components we need to compose into our solution architecture. In fact, GPT even helpfully points this out in its response:
‘"Remember that the specifics of your roadmap will depend on your unique business context and goals. Tailor these steps to fit the needs and characteristics of your organization."
This is not an insurmountable barrier, because we can tell the LLM about these things.
In the previous part, "Generative AI and Enterprise Architecture: Modeling the Enterprise", we touched on how grounding techniques like RAG pass relevant organizational knowledge into the LLM. In effect, this is what we did manually when we told it which applications our organization uses.
This could be accompanied by contextual information about those applications, like their strategic ratings as well as the integration points between them. All this is still well within the boundaries of prompt engineering.
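A grounding step of this kind can be sketched in a few lines. The fact store and keyword scoring below are deliberately naive stand-ins for a real RAG pipeline (which would use vector search over an EA repository); the application names come from our experiment, but the strategic ratings are invented:

```python
# Toy sketch of the grounding step: retrieve the organizational facts most
# relevant to a question and prepend them to the prompt before calling the LLM.
# The fact store and keyword scoring are naive stand-ins for a real RAG
# pipeline; the strategic ratings below are invented for illustration.

ORG_FACTS = [
    "CRM: Microsoft Dynamics, strategic rating: invest",
    "Marketing Automation: HubSpot, strategic rating: tolerate",
    "ERP: SAP S/4HANA, strategic rating: invest",
    "Mail client: Gmail, strategic rating: tolerate",
]

def retrieve(question, facts, top_k=2):
    """Rank facts by crude keyword overlap with the question."""
    words = {w.strip("?.,").lower() for w in question.split()}
    def score(fact):
        text = fact.lower()
        return sum(1 for w in words if w and w in text)
    return sorted(facts, key=score, reverse=True)[:top_k]

def grounded_prompt(question, facts):
    """Assemble a prompt grounded in retrieved organizational context."""
    context = "\n".join(retrieve(question, facts))
    return ("Use only the organizational context below.\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}")
```

In effect, this automates what we did by hand when we listed our applications in the chat.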
However, addressing the LLM’s organizational knowledge gap is not enough to get us to a completed solution architecture. Once we’re over that hurdle, we hit a more fundamental limitation: We can’t assume that our target architecture will be the same shape as our current architecture.
This means we can’t just swap out old components for new on a one-for-one basis. Multiple factors, including integration patterns, the use of SaaS or cloud services, data residency requirements, and much more, may mean our target architecture does not resemble our current one.
This is because a target architecture requires not just a translation but a reinterpretation of the business model into a new technology model.
This means we need to reason about the shape of our architecture.
Are LLMs Capable of Reasoning?
Reasoning in LLMs is an area of both active research and debate. Some researchers claim it’s an emergent capability that comes with the size of the model, while others maintain it requires a fundamentally different approach to machine learning.
Researchers into cognition — human or machine — distinguish between different types of reasoning: deductive, inductive, and symbolic reasoning, for example.
Deductive reasoning, in which a conclusion follows necessarily from the truth of its premises, will be familiar to most people: Socrates is a man; all men are mortal; therefore Socrates is mortal.
It’s also one of the easiest types of reasoning to automate as it is essentially driven by inference rules, which can be encoded as algorithms.
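Encoding the syllogism above as an inference rule makes this concrete. The following is a toy forward-chaining sketch, not a production reasoner:

```python
# Toy forward-chaining reasoner: rules are (premises, conclusion) pairs, and
# new facts are derived until no rule can add anything further.

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

RULES = [
    ({"Socrates is a man", "all men are mortal"}, "Socrates is mortal"),
]

derived = forward_chain({"Socrates is a man", "all men are mortal"}, RULES)
```

Because the rules are explicit, every derived conclusion is traceable back to its premises, which is exactly what makes deductive reasoning so amenable to conventional algorithms.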
Our own research has shown that the automated inference of a target architecture from a current one is quite possible using algorithmic approaches, even when the target architecture is a fundamentally different shape from the current.
The key is to understand that the target represents a reinterpretation, not a substitution, of the business and technology model. This process of reinterpretation can be automated via algorithms by introducing an abstraction layer like a logical integration model between the current and target physical models. However, our focus here is not on deterministic algorithms but probabilistic neural nets. And the question is whether LLMs are any better at performing that kind of reinterpretation than conventional algorithms or humans.
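The abstraction-layer idea can be illustrated with a toy sketch: each current physical component is mapped up to a logical capability, and each capability is then re-realized in the target technology model, so the target need not have the same shape as the current. All component and capability names here are invented for illustration; this is not Ardoq’s actual algorithm:

```python
# Toy sketch of the abstraction-layer approach: map each current physical
# component up to a logical capability, then re-realize each capability in the
# target technology model. All names are invented for illustration.

CURRENT_TO_LOGICAL = {
    "On-prem CRM": "Customer Relationship Management",
    "Point-to-point file transfer": "Application Integration",
}

LOGICAL_TO_TARGET = {
    "Customer Relationship Management": ["SaaS CRM"],
    "Application Integration": ["API gateway", "Event broker"],
}

def derive_target(current_components):
    """Reinterpret the current physical model via the logical layer."""
    target = []
    for component in current_components:
        capability = CURRENT_TO_LOGICAL[component]
        target.extend(LOGICAL_TO_TARGET[capability])
    return target
```

Note that one current component (point-to-point file transfer) is re-realized as two target components, so the target architecture ends up a different shape from the current one.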
From brief tests, we’d say they aren’t — yet. The probabilistic approach seems, on average, less reliable than rules-based approaches, although this could come down to better prompt engineering.
We have concluded that a lot more experimentation is needed here.
But the pace of technology development means this limitation, too, may be overcome in the near term: Developments like Google DeepMind’s AlphaGeometry point the way to AI models that can reason both probabilistically and deterministically.
If so, that could open the door to the automated creation of new architectures.
How Good Is the AI Architect?
So, what can we conclude about generative AI’s ability to automate To-Be architectures?
It’s definitely an accelerator. A significant proportion of creating a solution architecture or a roadmap is content generation. Since most technologies are commodified, with architecture patterns and reference models existing in the public domain, there’s a fair proportion of work the LLM can do to integrate this knowledge into a baseline artifact.
However, the twin limitations of missing organizational knowledge and unreliable reasoning mean its performance drops off sharply as we get into organizational specifics.
So, we’re not yet at the point where AI can make informed choices about the shape of the To-Be.
Alignment Through Personalization
But was it really a fair test?
We’re not out to debunk AI here but to figure out where it adds the most value. There’s certainly a risk we’ve stymied the AI through our own failure of imagination.
We naturally work from a set of biases based on existing practice: For our solution architecture document, our objective was simply to reproduce the kind of one-size-fits-all artifact we’re all used to.
But in doing that, we’re not really leveraging the LLM’s ability to re-interpret, recombine, and personalize at the point of consumption. So how about we focus on using the LLM to determine how the content is consumed rather than how it is defined?
Take these prompt examples:
"I’m the application manager for the Payments Gateway. What do I need to know about this proposed solution architecture in terms of potential compliance risks?"
"I’m an integration developer charged with integrating the new CRM application with our payments gateway. Help me get started with an API message specification."
At each point in the To-Be process, our LLM can act as a translator, a personalizer, and an accelerator. This gives us something even more valuable: engagement and buy-in from the many impacted roles we architects need to take with us on that journey.
Ultimately, the real power of generative AI in driving To-Be architectures may lie in building consensus.
The fourth part of this blog series looks at whether AI can aid EA teams with producing outcome-oriented architecture.