Overview
Amidst the pandemic and the series of lockdowns in Melbourne, I found myself with ample spare time at home. Utilizing this opportunity, I decided to share the knowledge I have accumulated throughout my two decades of experience in the integration domain. As an architect, I have had the privilege of working across diverse industries, including Telecommunications, Retail, Logistics, Financial Services, and Scientific Organizations, both in Australia and Europe. During my engagements, I noticed a common gap in many organizations: the absence of a comprehensive Reference Architecture specifically tailored for the Integration Domain. In this article, I aim to address this gap by presenting an exemplar Reference Architecture for the Integration Domain.
What is a Reference Architecture?
Wikipedia defines a reference architecture as follows:
“a reference architecture in the field of software architecture or enterprise architecture provides a template solution for an architecture for a particular domain. It also provides a common vocabulary with which to discuss implementations, often with the aim to stress commonality.”
https://en.wikipedia.org/wiki/Reference_architecture
Implementing a reference architecture in an organization brings about numerous advantages, including:
- Establishing a Common Language: By providing a standardized vocabulary, reference architectures enable stakeholders to communicate effectively, ensuring a shared understanding of the architecture and its components.
- Ensuring Consistent Technology Implementation: Reference architectures promote consistency in implementing technology solutions across projects and teams. This consistency helps minimize duplication, reduces development time, and improves the overall quality of the solutions.
- Encouraging Adherence to Standards and Patterns: Reference architectures define common standards, specifications, and design patterns that guide the development process. This encourages adherence to best practices, improves interoperability, and facilitates easier maintenance and integration of software systems.
By adopting a reference architecture, organizations can draw on established knowledge and experience, accelerate development efforts, and enhance governance practices. It serves as a valuable tool for aligning technology decisions and achieving consistency, efficiency, and scalability in software development.
Component Model
A component model presents a comprehensive view of the components that make up the reference architecture. While the component model below depicts a fully featured integration platform, not all organizations will require every component. The model should therefore be customized, for example by colour coding components to indicate which are already available within the organization and which are planned for future implementation. To streamline the model, components that are not actively under consideration can be excluded.

Component Model Layers
The following section provides an overview of the layers and essential components comprising the Integration Reference Architecture, collectively forming an Integration Platform. The primary objective of an Integration Platform is to establish an infrastructure that facilitates the seamless connection between data and service producers and the corresponding consumers who require them.
Consumers Layer
The Consumers Layer represents the various consumers of an organisation's data and services, accessed via the integration platform. These may include:
- Internal IT Systems (on the same network)
- External IT Systems (same organisation, but outside the network)
- External Partners
- Websites (internal or external) that may make use of your APIs
- Mobile Apps (internal or external) that may make use of your APIs
- Field devices
Integration Layer
The Integration Layer is comprised of a collection of technology components organized based on their shared functionality. Recognizing that reference architectures can sometimes be too abstract for stakeholders to relate to, I have endeavoured to utilize real-world component names that resonate with their practical understanding. This approach aims to bridge the gap between abstract architectural concepts and tangible, relatable components that stakeholders can easily comprehend.
Component | Description | Sample Products |
---|---|---|
API Management | API management refers to the set of practices, tools, and processes used to create, document, publish, secure, monitor, and govern Application Programming Interfaces (APIs). API management aims to simplify the development and management of APIs, enabling organizations to effectively expose their digital assets and services to internal and external developers, partners, and customers. | 3scale by Red Hat (Open Source), Amazon API Gateway, Apigee (Google), Azure API Management, Kong (Open Source), MuleSoft Anypoint Platform |
Middleware Services | Middleware Services, also known as an Enterprise Service Bus (ESB), encompass a collection of software components that enable real-time connectivity between data consumers and providers. They offer a range of capabilities, including: 1. Message Broker: Facilitates the exchange of messages between distributed systems, supporting asynchronous communication. Messages are stored in queues until retrieved by the intended recipients. 2. Routing: Determines the appropriate data consumers based on predefined rules and content, allowing for dynamic routing of data. 3. Protocol Translation: Transparently translates between different communication protocols such as HTTP(S) and JMS, ensuring seamless interoperability between systems. 4. Data Transformation: Enables the conversion of data from one format to another, ensuring compatibility and seamless integration between disparate systems. 5. Cross Reference: Supports the creation and management of relationships and code conversions between applications being integrated, ensuring data consistency and coherence. 6. Security Policies: Provides the ability to apply security measures to restrict access to services and encrypt data, ensuring confidentiality and integrity. 7. Service Orchestration: Coordinates multiple services into a single aggregate service, enabling complex business processes to be automated and streamlined. 8. Adapters or Connectors: Serve as bridges that facilitate the interaction between middleware services and packaged applications or technologies such as SAP or LDAP. 9. Integration Services: Encapsulate specific business activities into discrete services that can leverage any of the aforementioned middleware services. 10. Service Container: Provides a runtime environment for deploying, scaling, and modifying integration services independently of other services, ensuring flexibility and agility. (A minimal routing and transformation sketch follows this table.) | Dell Boomi, IBM Integration Bus, MuleSoft Anypoint Platform, Red Hat Integration (Open Source), Software AG webMethods, TIBCO Software, WSO2 (Open Source) |
Data Integration | Data Integration is an integration process that involves the collection, aggregation, transformation, and transportation of large datasets. It encompasses various components, including: 1. Extract Transform Load (ETL) facilitates the movement of substantial data volumes in batches between different systems. It enables the extraction of data from multiple sources, transforms it into a compatible format, and loads it into the target system. 2. Change Data Capture (CDC) captures and tracks changes made at a data source, ensuring that these modifications are available to other systems in real-time or near real-time. It enables efficient synchronization of data across systems and supports timely updates. 3. Master Data Management (MDM) provides tools and techniques for defining and managing critical data consistently within an organization. It establishes a central repository, acting as a single point of reference for essential data elements. MDM ensures data quality, integrity, and synchronization across various systems, allowing for accurate and reliable information. 4. Secure File Transfer (SFT) is a robust solution that ensures the secure and efficient transfer of files. It safeguards files both in transit and at rest, while also offering comprehensive reporting and auditing capabilities to track file transfer activities. With SFT, organizations can have confidence in the confidentiality, integrity, and reliability of their file transfers, ensuring sensitive data remains protected throughout the entire transfer process. | ETL/CDC: IBM DataStage, Informatica, SAP Data Services, Talend Data Integration (Open Source); MDM: Profisee MDM, Stibo Systems STEP, Talend MDM (Open Source), TIBCO EBX; SFT: AWS Transfer Family, Axway SecureTransport, IBM Sterling File Gateway |
Event Stream Processing (ESP) | Event Stream Processing (ESP) refers to a computational approach that analyses and processes real-time event data streams to derive insights, detect patterns, and trigger actions. It involves the continuous ingestion, analysis, and interpretation of events as they occur, allowing organizations to respond quickly and intelligently to events of interest. | Kafka, TIBCO Business Events |
Workflow | Workflow is the automation of a business process, in whole or in part, during which documents, information, or tasks are passed from one participant to another for action, according to a set of procedural rules. Two major components are typically used when implementing workflow: 1. Business Rule Management System (BRMS): a software solution that allows organizations to define, manage, and execute business rules in a centralized and automated manner. It provides a platform for capturing and organizing the business rules that govern decision-making and behaviour within an organization's processes and systems. 2. Business Process Management (BPM): an approach that enables the automation and optimization of the sequence of actions or activities involved in completing a business process. It encompasses the efficient handling of documents, information, and tasks as they are passed from one participant to another, adhering to predefined procedural rules. BPM ensures that each step of the process is tracked and can be escalated if not completed within the specified timeframe. By leveraging BPM, organizations can streamline their processes, enhance collaboration, and improve overall operational efficiency. | Red Hat Process Automation Manager (Open Source), jBPM (Open Source), Pegasystems Pega Platform, Camunda BPM |
Artificial Intelligence (AI) | Artificial Intelligence (AI) refers to machines' capability to replicate intelligent human behaviour and carry out tasks typically performed by humans. Machine Learning (ML), a subset of AI, enables machines to learn and improve from data without explicit programming. In the context of the integration layer, data can be passed through to ML models for training purposes, allowing the models to identify patterns and make informed decisions based on learned insights. This integration of ML with data flow empowers organizations to leverage intelligent decision-making and enhance overall system performance. | Google TensorFlow (Open Source), Meta PyTorch (Open Source) |
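To ground the Middleware Services row above, here is a minimal, product-agnostic sketch in Python of two of the listed capabilities, content-based routing and data transformation. The `Order` structure, routing rules, and destination names are hypothetical illustrations, not features of any particular ESB.

```python
from dataclasses import dataclass

# Hypothetical canonical message produced by an order-entry system.
@dataclass
class Order:
    order_id: str
    country: str
    total: float

# Illustrative routing rules: predicate -> destination queue or topic.
ROUTES = [
    (lambda o: o.country == "AU", "orders.domestic"),
    (lambda o: o.total >= 10_000, "orders.high-value"),
]
DEFAULT_ROUTE = "orders.international"

def route(order: Order) -> str:
    """Content-based router: choose a destination from the message content."""
    for predicate, destination in ROUTES:
        if predicate(order):
            return destination
    return DEFAULT_ROUTE

def transform(order: Order) -> dict:
    """Data transformation: map the internal model to a consumer's format."""
    return {"id": order.order_id, "amount": order.total, "market": order.country}

order = Order(order_id="A-1001", country="AU", total=250.0)
print(route(order), transform(order))  # orders.domestic {'id': 'A-1001', ...}
```

In a real platform these rules would typically be expressed in the product's own configuration or mediation language rather than in application code, but the underlying pattern is the same.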
Information Layer
The Information Layer plays a crucial role in an organization’s integration services by offering a consolidated perspective of its information assets and facilitating their utilization. It encompasses various essential components:
- Data Definition and Modelling tools aid in the creation of data models that are utilized in service definitions and data persistence.
- The Common Vocabulary acts as a centralized repository for storing shared business objects, their attributes, and relationships.
It is crucial to understand that this differs from a Common or Canonical Data Model (CDM). Instead of integrating services using a comprehensive Common Data Model that includes all attributes, a modern approach involves defining integration services based on the relevant content within the common vocabulary. This approach empowers domain teams to design services by selectively utilizing attributes from the common vocabulary, resulting in increased efficiency and alignment with specific integration needs. By adopting this approach, organizations can minimize complexity and maintenance costs associated with managing all attributes within a Common Data Model.
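As a minimal sketch of this selective approach, the example below defines a common vocabulary entry for a customer and two service payloads that each reuse only the attributes they need; the entity, attributes, and service names are hypothetical.

```python
from dataclasses import dataclass

# Common vocabulary: the full set of agreed attributes for "Customer".
@dataclass
class CustomerVocabulary:
    customer_id: str
    name: str
    email: str
    phone: str
    postal_address: str
    loyalty_tier: str

# Each service payload selects only the vocabulary attributes it needs,
# instead of carrying the entire canonical model on every interface.
@dataclass
class NotificationPayload:  # used by a hypothetical messaging service
    customer_id: str
    email: str

@dataclass
class ShippingPayload:  # used by a hypothetical logistics service
    customer_id: str
    name: str
    postal_address: str
```

Because every payload draws its names and types from the one vocabulary, services remain consistent with each other while their interfaces stay small and stable.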
By leveraging the Information Layer and its associated tools and repositories, organizations can effectively manage their information assets and streamline the design and development of integration services.
Management and Monitoring Layer
The Management and Monitoring Layer oversees and maintains the platform's components, encompassing both deployment and runtime activities. It includes monitoring components to detect deviations from acceptable thresholds and promptly alerting support teams when human intervention is needed.
- Service Management handles the complete lifecycle of integration services, including deployment, versioning, rollback, and starting/stopping operations.
- Metrics Monitoring captures runtime metrics from components and underlying infrastructure to assess availability, performance, integrity, and reliability. When components operate outside acceptable thresholds, alerts are triggered, and they automatically clear once normal operation is restored (see the sketch after this list).
- Logging & Auditing collects and stores logs from various components, enabling alert generation and facilitating root cause analysis. It also records audit data for tracking purposes and compliance with regulatory requirements.
- Error Handling captures and records exceptions raised by components, attempting automated remediation of known errors and promptly alerting operations when errors are unhandled.
- Job Scheduling governs unattended execution of background programs or batch processing.
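The sketch below illustrates the raise-and-clear alerting behaviour described under Metrics Monitoring, assuming hypothetical metric names and thresholds; in practice this is delegated to monitoring products rather than hand-written code.

```python
# Minimal threshold alerting: raise an alert when a metric breaches its
# threshold, and clear it automatically once the metric recovers.
THRESHOLDS = {"cpu_percent": 85.0, "queue_depth": 1_000}  # hypothetical limits
active_alerts: set[str] = set()

def evaluate(metrics: dict[str, float]) -> None:
    for name, limit in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            continue
        if value > limit and name not in active_alerts:
            active_alerts.add(name)
            print(f"ALERT {name}={value} exceeds {limit}")
        elif value <= limit and name in active_alerts:
            active_alerts.remove(name)
            print(f"CLEAR {name}={value} back within {limit}")

evaluate({"cpu_percent": 92.0, "queue_depth": 400})  # raises the CPU alert
evaluate({"cpu_percent": 60.0})                      # clears it again
```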
Governance Layer
The Governance Layer includes processes and tools to manage artifacts, policies, and the lifecycle of services and APIs. The major components of the governance layer are:
Component | Description | Sample Products |
---|---|---|
API Developer Portal | An API Developer Portal is a web-based platform designed to facilitate seamless collaboration between API providers and developers. It serves as a central hub where developers can discover, explore, and access APIs offered by the organization or a third-party service provider. The main purpose of the API Developer Portal is to enhance the API development experience by providing comprehensive documentation, interactive API testing tools, code samples, SDKs (Software Development Kits), and other resources that enable developers to understand, integrate, and utilize APIs effectively. The portal often includes access controls and authentication mechanisms to regulate API usage and may offer analytics to monitor API performance and usage patterns. | Apigee (Google) API Management Platform, Mulesoft Anypoint Platform, Kong Enterprise, 3scale (Red Hat Integration), WSO2 API Manager (Open Source) |
API and Service Catalogue | An API and Service Catalogue is a structured repository or database that contains detailed information about the various APIs and services available within an organization or across multiple systems. It acts as a centralized inventory of software components, APIs, microservices, or other software artifacts, providing a comprehensive overview of the functionalities, endpoints, data formats, and documentation related to each API or service. The API and Service Catalogue plays a crucial role in promoting reusability, standardization, and interoperability within the software ecosystem. It allows developers, architects, and stakeholders to discover, evaluate, and select the most suitable APIs or services for their projects, reducing development time and effort. Additionally, the catalogue may include information on API dependencies, versioning, and deprecation to assist in proper maintenance and governance of the APIs and services throughout their lifecycle. The information contained in the API and Service Catalogue is often made available via the API Developer Portal above. (A sketch of a catalogue entry follows this table.) | See API Developer Portal above. |
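As an illustration of the metadata such a catalogue typically records, here is a minimal sketch of a catalogue entry as a Python data structure; the fields and example values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogueEntry:
    """One API/service record in the catalogue (illustrative fields only)."""
    name: str
    version: str
    base_url: str
    description: str
    data_formats: list[str] = field(default_factory=list)
    deprecated: bool = False
    depends_on: list[str] = field(default_factory=list)

entry = CatalogueEntry(
    name="customer-api",
    version="2.1.0",
    base_url="https://api.example.com/customers",
    description="CRUD operations for customer records",
    data_formats=["application/json"],
    depends_on=["identity-service"],
)
```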
Security Layer
The security layer of the Integration platform encompasses several critical features to ensure data protection and access control. The following capabilities are often embedded into multiple components of the Integration platform:
- Authentication & Authorization plays a vital role in verifying user identity and controlling role-based access to platform functions. This includes managing service invocation, viewing service configurations, definitions, and monitoring information.
- Data Security ensures the confidentiality, integrity, and availability of data flowing through the integration platform: confidentiality safeguards data from unauthorized disclosure, integrity prevents unauthorized modification, and availability ensures that services are consistently accessible when needed.
- Non-repudiation provides the ability to prove the origin, time, and parties involved in a specific transaction.
- Secrets Management securely stores digital secrets like usernames, passwords, API keys, and tokens in an encrypted datastore. These secrets can be accessed via command line or API calls during deployment or runtime, allowing applications or scripts to retrieve the necessary secrets (see the sketch after this list).
- Transport Security is employed to establish point-to-point security between service consumers and providers at the transport layer, further enhancing data protection during transmission.
- Policy Enforcement & Monitoring serves as a policy enforcement point, enabling automated monitoring of policy violations, such as tracking the number of requests per minute. This ensures compliance and enhances the overall security of the Integration platform.
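As an example of Secrets Management in practice, here is a minimal sketch that retrieves a secret at runtime using HashiCorp Vault's Python client (hvac), assuming a KV v2 secrets engine and a hypothetical secret path; other secrets managers expose similar APIs.

```python
import os
import hvac  # HashiCorp Vault client: pip install hvac

# Connection details come from the environment, never from source code.
client = hvac.Client(
    url=os.environ["VAULT_ADDR"],
    token=os.environ["VAULT_TOKEN"],
)

# Read a secret from the KV v2 engine (the path "integration/db" is hypothetical).
response = client.secrets.kv.v2.read_secret_version(path="integration/db")
db_password = response["data"]["data"]["password"]
```

Retrieving secrets this way keeps credentials out of source control and deployment artifacts, and lets them be rotated without redeploying the services that use them.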
Development and Testing Tools
This layer, commonly referred to as the DevOps layer, encompasses all the essential capabilities for modelling, designing, testing, and deploying services effectively.
Component | Description | Sample Tools |
---|---|---|
Integrated Development Environment (IDE) | The Integrated Development Environment (IDE) offers a set of tools and processes that empower developers to design and build integration services throughout the delivery phases. | Visual Studio Code, IntelliJ IDEA, Eclipse
Testing Tools | Testing is facilitated through testing tools that ensure quality assurance and verify that integration services meet predefined requirements (see the sketch after this table). | Postman, SoapUI, Apache JMeter, Pact
CI/CD | Continuous Integration / Continuous Delivery and Deployment (CI/CD) is an automated method that streamlines the development lifecycle. During CI, code changes are built, tested, and merged into source control. During CD, services are packaged and deployed to one or more environments. | Jenkins, GitLab CI, Travis CI
Configuration Management and Automation | Configuration management refers to the practice of systematically managing and controlling changes to software, hardware, or infrastructure components in a consistent and controlled manner. The primary objective of configuration management is to maintain the desired state of the system or environment over time, ensuring that it remains stable, reliable, and compliant with predefined standards and requirements. Automation tools are software solutions that streamline and execute repetitive tasks and processes without human intervention. These tools aim to increase efficiency, reduce human errors, and accelerate the delivery of software and services. Automation tools can be applied to various domains, including infrastructure management, software development, testing and deployment. | Git, Terraform, Ansible, Puppet |
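Alongside GUI-driven tools such as Postman and SoapUI, API tests are frequently written in code. Below is a minimal sketch of an integration-style API test using pytest and the requests library; the endpoint URL and expected fields are hypothetical.

```python
import requests

BASE_URL = "https://api.example.com"  # hypothetical service under test

def test_get_customer_returns_expected_shape():
    # Call the service as a real consumer would.
    response = requests.get(f"{BASE_URL}/customers/123", timeout=5)
    assert response.status_code == 200
    body = response.json()
    # Verify the contract: required fields exist with the expected types.
    assert isinstance(body["customer_id"], str)
    assert "email" in body
```

Running such tests with `pytest` as part of the CI stage catches contract regressions before services are deployed.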
Producers Layer
The Producers Layer encompasses diverse data and service providers within an organization. These producers are interconnected with consumers through the Integration Platform.
Solution Patterns and Decision Framework
The Integration Reference Architecture needs to be tailored for each organisation in order for it to be relevant and gain adoption. After tailoring the reference architecture, the next step is to develop solution patterns and a decision framework.
- Solution Patterns are reusable solutions to commonly occurring problems. The Enterprise Integration Patterns (https://www.enterpriseintegrationpatterns.com/) are well known in the integration domain.
- A Decision Framework guides an architect in selecting the most appropriate solution pattern for the problem at hand.
Read the following article which describes a set of Solution Patterns and Decision Framework that support this Integration Reference Architecture: https://itarchitecturepatterns.net/integration-patterns-and-decision-framework/