Skip to content
Analytixus Logo
  • Home
  • Get started
  • Downloads
  • Blog
  • Login
  • Register
  • Questions?

    info@analytixus.de

  • Questions?

    info@analytixus.de

  • Home
  • Get started
  • Downloads
  • Blog
  • Login
  • Register
  • Andy Löwen
  • Sep, Thu, 2025
DnAML , supervised / unsupervised Metadata , XSLT

Enriching Metadata from Code Comments in Analytixus

In modern Data & Analytics projects, metadata is the backbone of automation and governance. Most of the time, we can derive structural metadata directly from the code — for example, by parsing SQL CREATE TABLE statements into XML-based models.

But sometimes, SQL alone isn’t enough. There are attributes we want to capture — like IsBusinessKey, IsTechnical, or History Flags — that cannot be expressed syntactically in the DDL.

This is where code comments come into play. In Analytixus, we use comments as metadata carriers. During the build process, they are extracted, transformed, and converted into actionable metadata.


From Code to Metadata

Here’s an example generated with an XSLT template:

CREATE OR REPLACE TABLE `WWI`.`Store_Countries`
(
    -- Technical Fields
    `DWH_DEF_DATE` TIMESTAMP NOT NULL /*IsTechnical*/
    ,`DWH_LOAD_ID` INT NOT NULL /*IsTechnical*/
    ,`DWH_RECORD_ID` BIGINT GENERATED ALWAYS AS IDENTITY NOT NULL /*IsTechnical*/
    ,`DWH_SOURCE_ID` INT NOT NULL /*IsTechnical*/
    ,`DWH_SOURCETABLE` STRING /*IsTechnical*/

    -- Custom Fields
    , `CountryID` INT NOT NULL
    , `CountryName` STRING NOT NULL
    , `FormalName` STRING NOT NULL
);

Notice the inline comment /*IsTechnical*/.
In the resulting XML metadata, this becomes:

<Column ColumnName="DWH_RECORD_ID"
        DataType="BIGINT"
        IsNullable="0"
        IsTechnical="1"
        IsPK="1"
        IsKey="1" />


Why Comments Matter

Some attributes simply cannot exist in pure SQL syntax:

  • Business Keys (IsBK)
  • Technical Fields (IsTechnical)
  • History Flags (IsHist)
  • Origin References (linking to Bronze/Silver layers)

Embedding these semantics as inline comments makes them machine-readable.

Example:

CREATE TABLE [Silver_Store].[Application_Cities_Key] 
/*IsHist:0; Name:Application_Cities; OriginId:Bronze.Bronze_Access.Application_Cities; HistObjectId:Silver.Silver_Store.Application_Cities_Hist1*/ 
(
    [DWH_Application_CitiesId] INT IDENTITY(2, 1) NOT NULL /*IsTechnical; IsPK*/
    ,[CityID] INT /*IsPK:0; IsBK:1*/
)


From Unsupervised to Supervised Metadata

During the build process, Analytixus first generates unsupervised metadata:

<Object Id="Silver.Silver_Store.Application_Cities_Key"
        ObjectType="Table"
        IsHist="0"
        Name="Application_Cities"
        OriginId="Bronze.Bronze_Access.Application_Cities"
        HistObjectId="Silver.Silver_Store.Application_Cities_Hist1">
  <Column ColumnName="CityID" DataType="INT" IsBK="1" />
   ...
    <Origins>
      <Object Id="Bronze.Bronze_Access.Application_Cities" SchemaName="Bronze_Access" ObjectName="Application_Cities" ViewName="Bronze_Access.Application_Cities" ObjectType="View">
        <Column Id="Bronze.Bronze_Access.Application_Cities.DWH_SOURCE_ID" ColumnName="DWH_SOURCE_ID" Ordinal="1" DataType="INT" IsNullable="0" IsPK="0" IsBK="0" IsKey="0" BasisDataType="INTEGER" OriginAliasName="Stage" OriginColumnName="DWH_SOURCE_ID" OriginId="Bronze.Bronze_Store.Application_Cities.DWH_SOURCE_ID" IsTechnical="1" />
   ...
      </Object>
   </Origins>
    <Hist>
      <Object Id="Silver.Silver_Store.Application_Cities_Hist1" SchemaName="Silver_Store" ObjectName="Application_Cities_Hist1" TableName="Silver_Store.Application_Cities_Hist1" ObjectType="Table" IsHist="1" Name="Application_Cities" OriginId="Bronze.Bronze_Access.Application_Cities" KeyObjectId="Silver.Silver_Store.Application_Cities_Key">
        <Column Id="Silver.Silver_Store.Application_Cities_Hist1.DWH_Application_CitiesId" ColumnName="DWH_Application_CitiesId" Ordinal="1" DataType="INT" IsNullable="1" IsPK="0" IsBK="0" IsKey="0" BasisDataType="INTEGER" IsTechnical="1" />
        ...
      </Object>
    </Hist>
</Object>

Then, attributes like HistObjectId and OriginId are resolved, and the metadata is enriched into supervised metadata:

  • History objects are linked.
  • Origins are connected to their Bronze counterparts.
  • Relationships and constraints are automatically added.

This metadata can now be consumed by further XSLT templates and DNA-ML models.


Benefits of Using Comments as Metadata

  • ✅ Flexibility: Developers influence metadata directly in the code.
  • ✅ Consistency: Attributes are extracted systematically.
  • ✅ Traceability: Origins and history objects provide lineage across layers.
  • ✅ Automation: Downstream templates can act on enriched metadata.

Conclusion

By using code comments as metadata carriers, we bridge the gap between SQL definitions and rich semantic models.

In Analytixus, comments are not just documentation — they are drivers of automation. They ensure that attributes like IsBK, IsTechnical, and IsHist become first-class citizens in the metadata layer.

This approach transforms unsupervised metadata into supervised metadata, enabling automation, lineage, and governance without slowing down development.

Search
Archives
  • November 2025
  • October 2025
  • September 2025
Analytixus

Accelerating data solutions with a metadata-first approach.

Quick Links
  • Blog
  • Privacy Policy
  • Downloads
  • Legal Notice
  • Get started
Copyright © 2025 - Andy Löwen - Analytixus - The Analytics-DRUID