{openAudit}

Massive and technical data lineage,
for less and better data!


{openAudit} relies on dynamic data lineage and the identification of data uses to map information systems and transform them: simplification, Cloud migration.


“{openAudit}, thanks to its incomparable capacity for technical introspection, allows us to achieve our objective: to understand the uses of data and simplify our legacy, in order to accelerate our Cloud migration.”

“{openAudit} offers impact analysis and data lineage capabilities with a PL/SQL parser (the flagship technology of our ETL flows), a parser that we have never encountered in other tools. It also allows us to do the cleansing that is required in our code, with excellent results.”

“{openAudit} is an easy-to-use tool that provides a quick and clear view of our SAP/Microsoft environment, but also a very useful data lineage for impact analysis, which is essential for our compliance topics.”









Data catalogue partner:

Dawizz



3 use cases, 6 features,
to transform a system

Use case #1:

Map a system

Teams change, technologies pile up, volumes explode.
{openAudit} is software that puts an end to this complexity: it operates an exhaustive, dynamic data lineage across all internal data flows to share with everyone a detailed and objective reading of the information system.

Use case #2:

Optimize a system

To reduce maintenance, practice FinOps, facilitate technical migrations, or engage in GreenOps, {openAudit} enables massive, iterative simplification of information systems by identifying unused data points and replicated or inoperative elements, on-premise or in the Cloud.

Use case #3:

Migrate to the Cloud

Migration projects are recurrent in companies, whether for tool-to-tool migrations or to move complete information systems to the Cloud.
{openAudit} allows these migrations to be carried out quickly and precisely by automating processes, while limiting regressions.

Map an information system


1) oA-Data-lineage-system: data lineage in databases

Starting from any "data point" (a field of a table, a table, a file, a schema), {openAudit} makes it possible to understand its origin and its uses, on-premise and in the Cloud, through technical, multi-technology data lineage. The underlying analyses are automatically replayed daily.

The information relating to the scheduling of the processing chains is available, and the uses of each "data point" are shown on hover: who consults which information, when, and via which tools, which is of major interest in a compliance framework.

{openAudit} offers different graphical representation modes for its data lineage (business / IT).

Use cases:
Share a detailed understanding of how flows are built, identify lineage breaks and correct them, Data Loss Prevention (DLP), BCBS 239 (Basel III), GDPR, etc.


Data lineage in databases

A true data lineage through the underlying layers, exhaustive and fully automated, offering multiple views according to need.

Resolving data lineage breaks

> Views: if they are stored, {openAudit} will read them, even if they are stacked (views of views of views...); a minimal sketch of this resolution follows the list.

> Dynamic SQL: if {openAudit} fails to resolve it directly, the dynamic SQL is resolved with runtime parameters, or with runtime logs.

> Other: in the event of information transfers by FTP, or when the database schema is not specified (as is the case in many ELT/ETL tools), {openAudit} resolves these breaks by structural recognition, or by reading the Batch / Shell scripts.
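
To make the view-resolution idea concrete, here is a minimal sketch, assuming a hypothetical catalog of stored view definitions and using the open-source sqlglot parser as a stand-in for {openAudit}'s own analyzers: stacked views are expanded recursively until only physical tables remain.

```python
# Minimal sketch: expand stacked views down to physical tables.
# VIEW_DEFINITIONS and all names are hypothetical; sqlglot stands in
# for {openAudit}'s own SQL parsers.
import sqlglot
from sqlglot import exp

VIEW_DEFINITIONS = {
    "v_sales_by_region": "SELECT region, SUM(amount) AS total FROM v_sales GROUP BY region",
    "v_sales": "SELECT o.region, o.amount FROM orders o JOIN customers c ON o.cust_id = c.id",
}

def physical_sources(name: str) -> set[str]:
    """Recursively expand a view (even views of views) to physical tables."""
    if name not in VIEW_DEFINITIONS:
        return {name}  # not a stored view: a physical table, end of the chain
    tree = sqlglot.parse_one(VIEW_DEFINITIONS[name])
    sources: set[str] = set()
    for table in tree.find_all(exp.Table):
        sources |= physical_sources(table.name)
    return sources

print(physical_sources("v_sales_by_region"))  # {'orders', 'customers'}
```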

Dynamically combine different data transformation technologies

> {openAudit} analyzes all processing technologies (object/procedural languages, ELT/ETL), on-premise or in the Cloud, and combines them in a single data flow, at the finest level of granularity. Drill-through provides access to the code.

> The process is dynamic, operated daily in delta mode, and therefore synchronized with the information system.

Different levels of analysis

> Cloud of points: this view makes it possible to instantly know the uses of a data point, disregarding transformations. It is also possible, from a use (a dashboard, data in a dashboard, a query), to instantly identify its operational sources.

> Mapping: this view displays, from any data point (field, table), a complete mapping of the upstream or downstream flow, i.e. from the operational sources to the exposure of the data (dataviz, query...). The information actually used is highlighted, and the uses of the information are specified on hover (who consults the data, when, how).

> Granular data lineage: this view makes it possible to progressively follow the deployment of data through the information system from a data point by iterative clicks, or conversely to go back to the operational sources; a sketch of this traversal follows the list. Each transformation (ELT/ETL job, procedural/object code) can be analyzed with the "drill through". The precise details of the uses of the data (who consults it, when, how, etc.) are specified.
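
As a rough illustration of these three views, a minimal sketch under one assumption: the lineage has already been extracted as a directed edge list between data points (all names are hypothetical). The "cloud of points" is then a downstream closure, and sourcing an upstream closure, over the same graph.

```python
# Minimal sketch: upstream/downstream traversal of a lineage graph.
# The edge list is hypothetical; {openAudit} builds it from code analysis.
from collections import defaultdict, deque

EDGES = [  # (source data point, target data point)
    ("crm.orders.amount", "dwh.f_sales.amount"),
    ("dwh.f_sales.amount", "mart.kpi.revenue"),
    ("mart.kpi.revenue", "dashboard.sales.total"),
]

downstream, upstream = defaultdict(set), defaultdict(set)
for src, dst in EDGES:
    downstream[src].add(dst)
    upstream[dst].add(src)

def closure(start: str, graph: dict) -> set[str]:
    """All data points reachable from `start`, one click at a time."""
    seen, queue = set(), deque([start])
    while queue:
        for nxt in graph[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen

print(closure("crm.orders.amount", downstream))    # every downstream use
print(closure("dashboard.sales.total", upstream))  # back to operational sources
```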

Map an information system


2) oA-Data-lineage-viz: data lineage in the reporting layer

{openAudit} makes it possible to understand all of the company's data visualization technologies through a single impact analysis interface: a grid that shows the staging between each constituent element of a dashboard and the physical source field (or view): from the dashboard cell, to the query that hits the database, via the semantic layer if there is one, etc.
Data lineage in the dashboard or in the underlying layers can be triggered from this interface.

Thus, all the internal management rules of the data visualization technologies are highlighted and shared with everyone. This data lineage in the data visualization layer can be attached to that of the underlying flows; a minimal sketch of such a grid follows.
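
A minimal sketch of what such a grid holds, with hypothetical rows: each row stages one dashboard element against its query, its semantic-layer object, and its physical source field, so that an impact analysis is a simple filter.

```python
# Minimal sketch of the impact-analysis grid; all rows are hypothetical.
# Each row: (dashboard cell, query, semantic-layer object, physical field).
GRID = [
    ("Sales.Total",  "q_sales", "Universe.Revenue", "dwh.f_sales.amount"),
    ("Sales.Region", "q_sales", "Universe.Region",  "dwh.f_sales.region"),
    ("Stock.Qty",    "q_stock", None,               "dwh.f_stock.qty"),  # no semantic layer
]

def impacted_cells(physical_field: str) -> list[str]:
    """Impact analysis: which dashboard elements depend on this field?"""
    return [cell for cell, _, _, field in GRID if field == physical_field]

print(impacted_cells("dwh.f_sales.amount"))  # ['Sales.Total']
```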


Use cases: Shedding light on complex management rules, analyzing the impact between a physical field and dashboard data, etc.

Data lineage in the reporting layer

Some dataviz technologies use semantic layers to make data intelligible to the business, and thus give it autonomy. These semantic layers create abstraction: the underlying physical fields are difficult to identify, which makes sourcing complex.

Furthermore, dataviz technologies often query views, views of views… which again complicates sourcing.

As dataviz technologies multiply, true multi-technology impact analyses (or sourcing) are complex to operate.

In addition, data visualization technologies now make it possible to do data preparation on a large scale, which creates significant opacity.

Technical answers

> {openAudit} operates a data lineage in the dataviz layer, in the expressions, in the variables, etc., to identify the fields that are directly or indirectly the source of a data point.

> {openAudit} analyzes the content of views to identify the physical fields that are the source of data for the dataviz layer, even if the views are stacked.

> {openAudit} combines the analyses of the different dataviz technologies in the same grid, which allows business and IT to carry out impact analyses between all the underlying layers and all the data visualization tools. Simply.

Optimize a system


1) oA-Optimization-system: detect "dead branches" in data feed chains

By combining an analysis of the uses of information with the flow of that information (data lineage), {openAudit} identifies useless flows and the associated "data points" (tables / files).

On average, 50% of what is stored in an information system has no added value: countless "dead branches" made of code, tables, views, and files that are wrongly maintained in the systems, with considerable impacts: inertia of legacy systems, maintenance costs, and, for Cloud systems, unsustainable bills with strong environmental impacts.

Use cases:
Decommissioning of dead branches before migration, rationalization of a system to reduce maintenance, FinOps, GreenOps for a more virtuous IT.

Detect "dead branches" in feeding chains

A large part of the content of information systems has no added value (replicated, obsolete), with significant impacts: maintenance, licenses, technical migrations made impossible, costs, etc.

Technical answers

> {openAudit} analyzes audit database logs and the data visualization layer to find out which data is actually being used.

> Starting from the fields used to feed data visualization tools, ad hoc queries (ODBC, JDBC), or specific ETL/ELT flows, {openAudit} identifies the data flows that act as sources, i.e. the "living branches" of the information system. By contrast, {openAudit} identifies the "dead branches", i.e. tables, procedures, and ETL/ELT jobs that build information that is never used; a minimal sketch of this logic follows the list.

> {openAudit} runs these analyses dynamically and, by building up a substantial depth of history, formally identifies the branches that are continuously unused, with everything they concentrate: tables, files, procedures, ELT/ETL jobs. Mass decommissioning can then take place in record time.

> In the Cloud, through an analysis of certain logs, {openAudit} identifies the cost of keeping dead branches in the system. The machine resources that could be saved are also highlighted. The company can thus enter into a FinOps and GreenOps logic.
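
A minimal sketch of the living/dead distinction, under two assumptions: the lineage is available as an edge list, and the audit logs yield the set of data points actually consulted (all names are hypothetical). Everything outside the upstream closure of the used points is a decommissioning candidate.

```python
# Minimal sketch: dead-branch detection = lineage minus the upstream
# closure of the data points actually used. All names are hypothetical.
from collections import defaultdict, deque

EDGES = [  # (source table, target table it feeds)
    ("staging.orders", "dwh.f_sales"),
    ("dwh.f_sales", "mart.kpi_revenue"),
    ("staging.legacy_export", "dwh.old_report"),  # feeds something never read
]
USED = {"mart.kpi_revenue"}  # actually consulted, per the audit logs

upstream, nodes = defaultdict(set), set()
for src, dst in EDGES:
    upstream[dst].add(src)
    nodes |= {src, dst}

living, queue = set(USED), deque(USED)
while queue:  # walk back from every used point to its operational sources
    for src in upstream[queue.popleft()] - living:
        living.add(src)
        queue.append(src)

print(nodes - living)  # {'staging.legacy_export', 'dwh.old_report'}: dead branches
```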

Optimize a system


2) oA-Optimization-viz: detect useless, replicated dashboards

The data visualization layer also contains unnecessary complexity. Users naturally tend to copy and paste, adding minor variations.

{openAudit} will analyze the sources, the queries, the expressions, the variables, and the data finally displayed, to allow optimizations. Obsolete, broken, and replicated dashboards are detected and can be archived or deleted.

We have developed features for SAP BO that perform certain bulk actions in an automated way, to get back to a streamlined platform almost instantly.

Use Cases:
Massive rationalization of the data visualization layer to reduce costs, reduce the risk of errors, and improve its intelligibility.

Detect useless, replicated dashboards

The data visualization layer is often complex: technologies are stacked, management rules are modified, dashboards are replicated, formulas are overloaded, etc. In the end, it is the very quality of the indicators, the very objectivity of a dashboard, that suffers.

Technical answers

> {openAudit} will directly parse the files of the data visualization solution to retrieve the intelligence, the structure of the dashboards and the semantic layer if there is one;

> {openAudit} will also access the repository to keep IDs consistent between the different dashboard objects (semantic layer, query, dashboard, others);

> An {openAudit} probe will retrieve certain logs from audit databases that are associated with data visualization solutions.

From there, {openAudit} will allow:


> To compare dashboards with each other and detect replication according to different criteria (a minimal sketch follows);

> To detect the obsolescence of dashboards;

> To identify broken formulas;

> To detect unnecessary queries.

All this comes in addition to the impact analysis grids and the data lineage.
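
A minimal sketch of replica detection, assuming the dashboards have already been reduced to their structure (sources and displayed fields; all names are hypothetical): dashboards sharing a normalized fingerprint are replication candidates, whatever their titles or cosmetics.

```python
# Minimal sketch: group dashboards by a structural fingerprint.
# The dashboard structures are hypothetical extracts.
import hashlib
import json

DASHBOARDS = [
    {"name": "Sales EMEA",        "sources": ["dwh.f_sales"], "fields": ["region", "amount"]},
    {"name": "Sales EMEA (copy)", "sources": ["dwh.f_sales"], "fields": ["amount", "region"]},
    {"name": "Stock",             "sources": ["dwh.f_stock"], "fields": ["sku", "qty"]},
]

def fingerprint(dash: dict) -> str:
    """Hash the structure only: names and cosmetics are ignored."""
    payload = json.dumps(
        {"sources": sorted(dash["sources"]), "fields": sorted(dash["fields"])},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

groups: dict[str, list[str]] = {}
for dash in DASHBOARDS:
    groups.setdefault(fingerprint(dash), []).append(dash["name"])

for names in groups.values():
    if len(names) > 1:
        print("replication candidates:", names)  # ['Sales EMEA', 'Sales EMEA (copy)']
```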

Migrate to the Cloud


1) oA-Migration-system: migrate procedural code

Technological migrations of object/procedural languages are often so complex that companies prefer to stack technologies rather than decommission them.
Yet they struggle to maintain these languages for lack of experts capable of reverse engineering them.
Today, the momentum of the Cloud is changing this paradigm: more and more companies are looking to get rid of these legacy languages quickly, with no other option than to embark on risky and costly migrations.

Use cases:
DBMS change, Cloud migration, maintainability of a legacy system, etc.


Migrate procedural code

Large companies have always accumulated processing technologies. The pile-up is continuous, because removing a technology often presents too many risks. Yet the associated skills are becoming scarce, and retro-documentation is rarely in place.
At some point, companies have to get started! These can be consultancy projects: long, expensive, and risky. We think it is better to automate the process.

Technical answers

> {openAudit} will "parse" the source code, it will break down all the complexity of the code using a grammar allowing exhaustive and ultra-granular analyses. All subtleties will be taken into consideration,

> {openAudit} deduces the overall processing logic and intelligence, which are reconstructed in an algorithmic, technology-agnostic tree. On this basis, {openAudit} produces "standard SQL",

> The intelligence is then reconstructed in the specific SQL of the target database (e.g. BigQuery for Google, Redshift for Amazon, Azure SQL for Microsoft, etc.); a minimal sketch of this re-targeting step follows the list,

> All complex processing that cannot be reproduced in simple SQL is driven by a NodeJS executable: typically "For Loop" cursors, variables, "If Else" conditional code, "Switches", procedure calls, etc.,

> {openAudit} produces YAML files (easy-to-read files), so that the understanding of the complexity is shared with as many people as possible,

> Optionally, new orchestration mechanisms can be implemented to deconstruct the cursors of cursors (the loops of loops) and optimize the transformation chains.
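
As an illustration of the re-targeting step only, here is a minimal sketch using the open-source sqlglot transpiler as a stand-in for {openAudit}'s own grammar (the statement is a hypothetical extract from legacy Oracle code):

```python
# Minimal sketch: re-emit a legacy Oracle statement in the SQL dialect of
# a target Cloud database. sqlglot stands in for {openAudit}'s own parser.
import sqlglot

oracle_sql = "SELECT NVL(amount, 0) AS amount, SUBSTR(region, 1, 3) AS zone FROM sales"

for target in ("bigquery", "redshift"):  # e.g. Google, Amazon
    print(target, "->", sqlglot.transpile(oracle_sql, read="oracle", write=target)[0])
```

In the pipeline described above, anything that cannot be reproduced in simple SQL would instead be handed over to the NodeJS executable.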

Migrate to the Cloud


2) oA-Migration-viz: migrate dashboards to the Cloud

How do you decommission a dataviz technology that has become too static, too expensive, or incompatible with the target architecture, especially in the Cloud? How do you move serenely towards the tools of tomorrow, which the business lines are also calling for?

{openAudit} enables almost fully automated migrations between different dataviz technologies, to save considerable time, avoid damaging regressions, and quite simply make these projects possible!

Use cases:
Migrate SAP BO to Looker or Power BI, migrate Qlik Sense to Power BI, etc.; many scenarios are possible!


Migrate dashboards to the Cloud

Most dataviz tools have two things in common: a semantic layer that interfaces between IT and the business, and a dashboard editor.
We rely on {openAudit}'s automated reverse engineering to deconstruct the complexity of the source, so that it can be rebuilt in the target technology.

Methodology

> {openAudit} feeds the target technology from a single semantic layer, a kind of pivot model, generated automatically from the dataviz tools to be decommissioned (a minimal sketch follows the list),

> The structure of the initial dashboard is also analyzed by {openAudit}, and it too can be transcribed into the target technology,

> Thus, sprawling migration projects that were difficult or impossible to implement can be carried out in record time.
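
A minimal sketch of the pivot-model idea: a tool-agnostic structure extracted from the legacy semantic layer is re-emitted in the format of the target tool (here a LookML-like view; the pivot model and all names are hypothetical).

```python
# Minimal sketch: generate a LookML-like view from a tool-agnostic
# pivot model. The pivot model and field names are hypothetical.
PIVOT = {
    "object": "Sales",
    "table": "dwh.f_sales",
    "fields": [
        {"name": "Region", "sql": "region"},
        {"name": "Amount", "sql": "amount"},
    ],
}

def to_lookml(model: dict) -> str:
    """Emit a LookML-like view definition for the target tool."""
    lines = [f'view: {model["object"].lower()} {{',
             f'  sql_table_name: {model["table"]} ;;']
    for field in model["fields"]:
        lines += [f'  dimension: {field["name"].lower()} {{',
                  f'    sql: ${{TABLE}}.{field["sql"]} ;;',
                  '  }']
    lines.append('}')
    return "\n".join(lines)

print(to_lookml(PIVOT))
```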




News

2023-04-07
Ellipsys was at FIC 2023!

{openAudit} at the service of DLP (data loss prevention):
We discussed the value of {openAudit}'s data lineage for understanding how sensitive data propagates within information systems. Log analysis, in turn, makes it possible to know who accesses that data. It thus becomes possible to stem data leaks before the data is disseminated.


2022-12-01
{openAudit} is part of the UGAP catalog!

{openAudit} by Ellipsys was selected in the UGAP call for tenders.
Going through UGAP exempts French public-sector buyers from any competitive tendering and prior advertising when acquiring software from the catalog, including {openAudit}!


2022-09-27
Conference: transforming your IS thanks to Data Lineage!

The ADEO / Leroy Merlin group led a workshop in front of 150 people at the Big Data Paris 2022 show, explaining how it carried out the transformation of its information system (simplification / GCP migration) based on data lineage, and why {openAudit} is essential within a Data Mesh architecture.




About Ellipsys:

Ellipsys was founded by Samuel Morin in 2013, on a simple observation: information systems get bigger, more complex, and more heterogeneous as technologies accumulate and users multiply. Ellipsys's promise was to automate the analysis of these IS and to empower teams to improve them: make them simpler, easier to migrate... This remains our ambition, and our know-how has developed strongly around data lineage in particular, so that we can now tackle many architectures!
The team is made up of several high-level engineers, all keen on research and development ... and customer impact!

Data inventory:

A "data catalog" with files, flows, data sets: all physical data persisted or in memory, views, reports...

Probes and log parsing:

To track the consumption and injection of data.

Introspection of the dataviz layer:

> Know the link between technical and business information,
> Gather intelligence (business rules),
> Propagate business terms to the underlying layers to keep a “business reading” of the processes.

Reverse engineering of the code:

For end-to-end technical granular data lineage, synchronized with the IS.

Specificities:

> All analyses are carried out daily, in delta mode, so that openAudit® is permanently synchronized with the IS.
> openAudit® also means open databases and web interfaces, on-premise or in SaaS.
> We provide APIs or Web Components as needed, to be used at the customer's convenience.

Scheduler parsing:

Understand scheduling to link it to data lineage and data uses.

Contact
