Discovery Engine Open Metadata Access Service (OMAS)

The Discovery Engine OMAS provides APIs and events for metadata discovery tools that are surveying the data landscape and recording information in metadata repositories.

These types of tools are called Discovery Engines in the Open Discovery Framework (ODF), which is why this access service is called the Discovery Engine OMAS.

The Open Discovery Framework (ODF) provides a comprehensive set of open APIs that describe the interaction between metadata discovery tools and a metadata server. The aim is to make it easy for metadata discovery tools to work with open metadata repositories.

The capabilities defined in the ODF fall into 4 broad categories.

Figure 1 shows how these capabilities work together.


Figure 1: Interfaces of the Discovery Engine OMAS

  1. The engine host server retrieves configuration from the Governance Engine OMAS.
  2. When a discovery engine receives a request to analyse an asset, it retrieves the annotations from previous analyses of this asset.
  3. While the discovery service is running, it writes new annotations about the asset through the Discovery Engine OMAS.

More details of this processing follow.

Discovery Engine Configuration

The configuration of the discovery engines, and of the discovery services that they support, is managed in the metadata server through the Governance Engine OMAS.

The Engine Host OMAG Server is typically located close to the data assets to minimize the network traffic resulting from the analysis. Where the data assets are distributed in multiple locations, it is possible to deploy an Engine Host server in each location so the discovery workload is kept close to the data.

A single Discovery Engine OMAS can support multiple engine hosts deployed in this way.

The Asset Analysis OMES on the engine host server is configured with the location of the metadata server where the Discovery Engine OMAS is running along with the names of the discovery engines it will host. The same discovery engine can simultaneously run on multiple engine host servers. This means the Asset Analysis OMES can host all of the discovery engines it needs to analyse the assets at its location.

When the Asset Analysis OMES starts in the engine host, it calls the Governance Engine OMAS to retrieve the configuration for each of its discovery engines (see Figure 1, number 1). It also connects to the Governance Engine OMAS’s out topic to receive any updates on this configuration while it is running.
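The start-up sequence described above can be sketched as follows. This is a minimal illustration only, not Egeria's API: `FakeGovernanceEngineClient`, `EngineHost`, `get_engine_config`, `on_config_event`, the engine name and the event shape are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGovernanceEngineClient:
    """Hypothetical stand-in for a client of the Governance Engine OMAS."""
    configs: dict  # engine name -> configuration dict held by the metadata server

    def get_engine_config(self, engine_name: str) -> dict:
        # Return a copy so the caller's cache is independent of the server's state.
        return dict(self.configs[engine_name])


@dataclass
class EngineHost:
    """Hypothetical engine host that caches configuration for its discovery engines."""
    client: FakeGovernanceEngineClient
    engine_names: list
    cache: dict = field(default_factory=dict)

    def start(self) -> None:
        # Figure 1, number 1: retrieve the configuration for each hosted engine.
        for name in self.engine_names:
            self.cache[name] = self.client.get_engine_config(name)

    def on_config_event(self, event: dict) -> None:
        # Called for each event read from the out topic (assumed event shape).
        # Events for engines that are not hosted here are ignored.
        name = event["engine_name"]
        if name in self.cache:
            self.cache[name] = self.client.get_engine_config(name)


client = FakeGovernanceEngineClient({"quality-engine": {"version": 1}})
host = EngineHost(client, ["quality-engine"])
host.start()

# The configuration changes on the metadata server while the host is running.
client.configs["quality-engine"] = {"version": 2}
host.on_config_event({"engine_name": "quality-engine"})
```

In this sketch the event only signals that something changed and the host re-reads the full configuration. That is one possible design; the real events may carry more detail.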

The discovery engine’s configuration includes the list of discovery request types that it supports. Each discovery request type is linked to the discovery service that should run when that request type is requested against a specific asset. This is shown in Figure 2.


Figure 2: Discovery Engine Configuration
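One way to picture the relationship in Figure 2 is as a lookup from discovery request type to discovery service. The sketch below assumes a simple in-memory structure; `DiscoveryEngineConfig`, `RegisteredDiscoveryService`, `service_for` and the example names are hypothetical and are not taken from the Egeria code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredDiscoveryService:
    """Hypothetical record linking one request type to one discovery service."""
    request_type: str
    service_name: str


@dataclass
class DiscoveryEngineConfig:
    """Hypothetical configuration for a single discovery engine."""
    engine_name: str
    services: list  # list of RegisteredDiscoveryService

    def service_for(self, request_type: str) -> str:
        # Find the discovery service registered for the requested type.
        for registered in self.services:
            if registered.request_type == request_type:
                return registered.service_name
        raise KeyError(
            f"{self.engine_name} does not support request type {request_type!r}"
        )


config = DiscoveryEngineConfig(
    engine_name="quality-engine",
    services=[
        RegisteredDiscoveryService("profile-columns", "ColumnProfiler"),
        RegisteredDiscoveryService("count-rows", "RowCounter"),
    ],
)
selected = config.service_for("count-rows")
```

A request type that the engine does not support raises `KeyError` here; how the real discovery engine reports that case is not covered by this page.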

Processing Discovery Requests

When a discovery request is made, the discovery engine creates an instance of the discovery service and gives it access to a discovery context. The discovery context provides access to the existing metadata known about the asset, a connector to access the data stored in the asset, and a store to record the new metadata that the discovery service has discovered about the asset. Behind the scenes, the discovery context calls the Discovery Engine OMAS both to retrieve metadata about the asset and its connector (see Figure 1, number 2) and to store the new metadata (see Figure 1, number 3).
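The flow in the paragraph above can be sketched as follows. This is an illustration under stated assumptions rather than the ODF interfaces themselves: `InMemoryAnnotationStore` stands in for the Discovery Engine OMAS, and `DiscoveryContext`, `RowCountService` and `handle_discovery_request` are hypothetical names with hypothetical method signatures.

```python
from typing import Callable


class InMemoryAnnotationStore:
    """Hypothetical stand-in for the Discovery Engine OMAS annotation calls."""

    def __init__(self) -> None:
        self._annotations: dict = {}

    def get_annotations(self, asset_id: str) -> list:
        return list(self._annotations.get(asset_id, []))

    def add_annotation(self, asset_id: str, annotation: dict) -> None:
        self._annotations.setdefault(asset_id, []).append(annotation)


class DiscoveryContext:
    """Hypothetical context handed to a discovery service for one request."""

    def __init__(self, asset_id: str, store: InMemoryAnnotationStore,
                 connector: Callable[[], list]) -> None:
        self.asset_id = asset_id
        self._store = store
        self._connector = connector

    def previous_annotations(self) -> list:
        # Figure 1, number 2: existing metadata about the asset.
        return self._store.get_annotations(self.asset_id)

    def read_asset(self) -> list:
        # The connector gives access to the data stored in the asset.
        return self._connector()

    def record(self, annotation: dict) -> None:
        # Figure 1, number 3: store newly discovered metadata.
        self._store.add_annotation(self.asset_id, annotation)


class RowCountService:
    """Hypothetical discovery service that records a row count."""

    def run(self, context: DiscoveryContext) -> None:
        rows = context.read_asset()
        run_number = len(context.previous_annotations()) + 1
        context.record({"type": "row-count", "rows": len(rows), "run": run_number})


def handle_discovery_request(service_class, asset_id, store, connector) -> None:
    # A new service instance and a new context are created for every request.
    service = service_class()
    context = DiscoveryContext(asset_id, store, connector)
    service.run(context)


store = InMemoryAnnotationStore()
for _ in range(2):
    handle_discovery_request(RowCountService, "asset-1", store, lambda: ["a", "b", "c"])
annotations = store.get_annotations("asset-1")
```

After two requests the store holds two annotations for `asset-1`, and the second one can see the first through `previous_annotations`.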

Further Information

The Open Discovery Framework (ODF) provides more information about the discovery engines and discovery services along with the metadata APIs.

In Egeria, both the metadata server where the Discovery Engine OMAS runs and the engine host where the Asset Analysis OMES runs are types of OMAG Servers. More information on the operation of the engine host can be found under the Engine Services.

An overview of automated metadata discovery approaches is available here.

Design information

The module structure for the Discovery Engine OMAS is as follows:


License: CC BY 4.0, Copyright Contributors to the ODPi Egeria project.