Custom Metadata Scanner in Informatica Cloud Data Governance and Catalog

1. Introduction

Custom Scanner:

In Informatica Intelligent Cloud Services (IICS), a “custom scanner” is a tool or component used to extend the capabilities of the platform by integrating with custom or non-standard data sources. Essentially, it allows users to create custom code or configurations to scan and extract metadata from sources that are not natively supported by IICS.

2. Why Custom Scanner?

Custom scanners in IICS are needed when:

  • Non-Standard Data Sources: To integrate with proprietary or unique data sources not supported natively.
  • Complex Metadata: To capture specific or complex metadata that default scanners don’t handle.
  • Legacy Systems: To interface with older or legacy systems lacking modern integration capabilities.
  • Enhanced Data Lineage: To track complex data flows and relationships not covered by default tools.
  • Custom Governance: To meet unique data governance and compliance requirements.
  • New Technologies: To integrate emerging or innovative data technologies into your system.

In summary, custom scanners extend IICS’s capabilities to handle specialized integration and metadata extraction needs.

3. Steps to Create Custom Metadata in Metadata Command Center

1. Open the Metadata Command Center service from IICS.

2. Go to the Customize tab, click Metadata Models, then click Download Template → Model.

3. Click Model. The sample.json file is downloaded; it shows the expected JSON file structure.

4. Format the JSON file according to your business requirements.

5. The JSON file for my project requirements defines the following sections: Classes, Associations, Attributes and classAttributes.

Package:

  • The package represents a container for the classes in the model.

Classes:

  • A class describes a group of objects that share the same characteristics. Each class has a set of attributes: its own and those inherited from its superclasses.
  • Ex: Folder, File, Report, Schema, Table, Column, etc.

Associations:

  • An association represents the relationship between two objects in the catalog. You can create associations between objects within the custom catalog source and to objects already synced to the catalog.
    • Ex: Project to Folder to File; DB to Schema to Tables to Columns.

Attributes:

  • An attribute describes a characteristic of an object. Define the model attributes and how they apply to classes and relationships by creating attribute class entries and attribute relationship entries.
    • Ex: File Type, Folder Created, DataType, Length, PrimaryKey, Catalog Source Name, Catalog Source Type, core_producer, Default, Description, External Link, Name, Nullable, Origin, Reference, Reference ID, Source Created By, Source Created On, Source Modified By, Source Modified On, SourceStatement, Technical Description, UniqueKey.
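To make the four concepts concrete, here is a minimal Python sketch of how a package, its classes, associations and attributes relate to each other. The field names, the package name `custom.demo` and the association names are hypothetical; they do not reproduce the keys of the downloaded sample.json template. Only the class and attribute names are taken from the examples above.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Illustrative in-memory view of a custom model (not the template's JSON layout)."""
    package: str  # container for the classes
    classes: set[str] = field(default_factory=set)
    # Each association is (association name, source class, target class).
    associations: list[tuple[str, str, str]] = field(default_factory=list)
    # Attribute name -> classes the attribute applies to.
    class_attributes: dict[str, set[str]] = field(default_factory=dict)


def undeclared_references(model: Model) -> set[str]:
    """Return class names used by associations or attributes but never declared."""
    used: set[str] = set()
    for _name, source, target in model.associations:
        used.update((source, target))
    for applies_to in model.class_attributes.values():
        used.update(applies_to)
    return used - model.classes


model = Model(
    package="custom.demo",  # hypothetical package name
    classes={"Schema", "Table", "Column"},
    associations=[
        ("SchemaToTable", "Schema", "Table"),   # hypothetical association names
        ("TableToColumn", "Table", "Column"),
    ],
    class_attributes={"DataType": {"Column"}, "PrimaryKey": {"Column"}},
)
print(undeclared_references(model))  # empty when every reference is declared
```

Associations to objects already synced to the catalog would also show up in the result, so treat it as a prompt to double-check the model rather than as an error.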

Note:

Verify that the package name does not contain com.infa.odin, com.informatica or infa. These keywords are reserved for system models.
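The check in the note can be automated before uploading. The sketch below assumes that "contain" means a case-insensitive substring match, which is an interpretation rather than documented behaviour; the package names in the example calls are hypothetical.

```python
# Keywords reserved for system models, as listed in the note above.
RESERVED_KEYWORDS = ("com.infa.odin", "com.informatica", "infa")


def reserved_keywords_in(package_name: str) -> list[str]:
    """Return the reserved keywords found in the package name (empty list if none)."""
    lowered = package_name.lower()  # assumption: the check is case-insensitive
    return [keyword for keyword in RESERVED_KEYWORDS if keyword in lowered]


print(reserved_keywords_in("custom.demo"))             # hypothetical, no reserved keyword
print(reserved_keywords_in("com.informatica.custom"))  # hypothetical, uses a reserved keyword
```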

6. Validate the JSON file at the link below to confirm that it is syntactically valid.

https://jsonlint.com/
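If you would rather not paste the model into a website, the same syntax check can be done locally. This is a minimal sketch using Python's standard `json` module; like the online validator, it only confirms that the text is well-formed JSON, not that it is a valid metadata model. The key name in the sample strings is hypothetical.

```python
import json


def check_json(text: str) -> str:
    """Return 'valid', or a message locating the first syntax error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid: {err.msg} (line {err.lineno}, column {err.colno})"
    return "valid"


print(check_json('{"name": "custom.demo"}'))   # well-formed
print(check_json('{"name": "custom.demo",}'))  # trailing comma is not allowed in JSON
```

To check a file on disk, pass its contents, for example `check_json(open("model.json", encoding="utf-8").read())`, where `model.json` stands for your own file name.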

7. Once the JSON is valid, upload the file in Metadata Models.

8. Click the + icon and click the Browse option to select the JSON file.

9. When creating the metadata model, enter the same package name as in the JSON file.

10. Click the Create button to upload the JSON file to Metadata Models.

11. Before publishing the package, verify the classes, attributes and associations.

12. Once validation is complete, publish the package in Metadata Models.

13. Download the metadata template in zip format.

14. Once the zip file is downloaded, unzip it and fill in the business requirement information in the CSV files.

Unsupported Metadata Sample for Custom Model:

  • report.CustomAnaplandemo.DashboardSchema
  • report.CustomAnaplandemo.DataSourceTable
  • report.CustomAnaplandemo.TableField
  • Links.csv (association names)
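Before zipping, it can help to confirm that every row in the links file points at objects that actually exist in the filled-in CSV files. The sketch below is only an illustration: the column headers (`id`, `Source`, `Target`, `Association`) and all values are hypothetical, so substitute the headers from your downloaded template.

```python
import csv
import io

# Hypothetical file contents; the real template defines its own column headers.
objects_csv = """id,name
schema1,Sales
schema1/orders,Orders
"""
links_csv = """Source,Target,Association
schema1,schema1/orders,SchemaToTable
schema1/orders,schema1/orders/amount,TableToColumn
"""


def dangling_links(objects_text: str, links_text: str) -> list[dict[str, str]]:
    """Return link rows whose Source or Target is not a known object id."""
    known = {row["id"] for row in csv.DictReader(io.StringIO(objects_text))}
    return [
        row
        for row in csv.DictReader(io.StringIO(links_text))
        if row["Source"] not in known or row["Target"] not in known
    ]


for row in dangling_links(objects_csv, links_csv):
    print("unresolved link:", row)
```

Here the second link is reported because `schema1/orders/amount` is never defined. Links to objects that already exist in the catalog would be reported as well, so review the output rather than treating every row as a mistake.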

15. Once the CSV files are filled in, zip them, using _ (underscore) rather than . (dot) in the name; otherwise you get a ". (dot) not allowed" error.
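The zipping step can be scripted so the naming rule is applied consistently. This is a minimal sketch that assumes the rule concerns the zip file's own name and that the CSV files should sit at the root of the archive; both are assumptions, and the folder and archive names in the commented usage line are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_csv_files(folder: Path, archive_stem: str) -> Path:
    """Zip every CSV in `folder` into an archive whose name uses _ instead of dots."""
    safe_stem = archive_stem.replace(".", "_")
    archive = folder.parent / f"{safe_stem}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for csv_file in sorted(folder.glob("*.csv")):
            # arcname drops the folder path so each file lands at the archive root.
            zf.write(csv_file, arcname=csv_file.name)
    return archive


# Usage (hypothetical folder and archive name):
# zip_csv_files(Path("template"), "Account.custom")  # writes Account_custom.zip
```

The archive is written next to the folder rather than inside it, so the zip never ends up including itself on a rerun.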

16. Now create the custom source in the MCC service.

17. Provide a custom source name of your choice.

18. Go to New → Custom Source → Custom Catalog Sources, then click the custom catalog you created.

19. Provide a name of your choice.

20. Upload the prepared CSV file in File Details.

21. Go to the Configuration tab, set the Metadata Change Option to Retain, then save and run the job.

22. Once the job is complete, navigate to the Data Governance and Catalog service.

23. Open the Account_custom job in CDGC and explore the Hierarchy tab to view the table and column details.

4. Advantages and Disadvantages of Custom Scanner

Advantages:

  • Flexibility: Customizes for specific needs.
  • Enhanced Metadata: Captures detailed and complex metadata.
  • Legacy Integration: Connects with older systems.
  • Improved Lineage: Tracks complex data flows.
  • Compliance: Meets specific governance and compliance requirements.
  • Optimized Performance: Can be fine-tuned for efficiency.

Disadvantages:

  • Development Effort: Requires significant time and resources.
  • Complexity: Adds complexity to the environment.
  • Maintenance: Needs ongoing updates and maintenance.
  • Support Challenges: Limited support from IICS.
  • Cost: Can be costly in terms of time and resources.
  • Error Risk: Potential for bugs and errors.

5. Conclusion

Custom scanners offer the flexibility to handle specialized data sources, capture detailed metadata, and integrate with legacy systems, ensuring a comprehensive and effective data management strategy. While the development and maintenance of custom scanners require a thoughtful investment, the ability to extend IICS’s functionality to meet your precise requirements provides substantial benefits. Implementing a custom scanner not only improves data integration but also enhances data governance, lineage tracking, and overall system efficiency, helping you achieve a more streamlined and insightful data environment.
