Mapping Suite SDK

Introduction

The Mapping Suite SDK is a powerful software development kit designed to standardise and simplify the handling of mapping packages containing XML-to-RML (RDF Mapping Language) transformation rules and related components.

Key Features

Flexible Package Management

  • Load mapping packages from multiple sources

    • Local folders

    • ZIP archives

    • GitHub repositories

    • MongoDB databases

Comprehensive Serialisation

  • Convert mapping packages between file-based and object representations

  • Support for complex mapping package structures

  • Robust error handling and validation

Command Line Interface

  • Powerful CLI for common operations

  • Validation of mapping packages from archives, folders, and GitHub repositories

  • Options for including test data and output in validation

  • Verbose mode for detailed debugging information

Advanced Extraction

  • Flexible extractors for different package sources

  • Temporary and permanent extraction modes

  • Pattern-based package discovery

MongoDB Integration

  • Type-safe repository operations

  • CRUD functionality for mapping packages

  • Advanced querying capabilities

Observability

  • Built-in OpenTelemetry tracing

  • Performance monitoring

  • Detailed error tracking

In future

  • Validation service

  • Transformation capabilities

Quick Start

Installation

Using pip

pip install mapping-suite-sdk

Using Poetry

poetry add mapping-suite-sdk

Basic Usage

from pathlib import Path
import mapping_suite_sdk as mssdk

# Load a mapping package
package = mssdk.load_mapping_package_from_folder(
    mapping_package_folder_path=Path("/path/to/mapping/package")
)

# Serialise the package
mssdk.serialise_mapping_package(
    mapping_package=package,
    serialisation_folder_path=Path("/output/path")
)

Core Components

Adapters

  • Extractor: Handle package extraction from various sources

  • Loader: Load different components of mapping packages

  • Repository: Manage package storage and retrieval

  • Serialiser: Convert packages between formats

  • Tracer: Provide observability and performance monitoring

Services

  • Package loading from multiple sources

  • Package serialisation

  • Tracing and monitoring

Supported Mapping Package Components

  • Metadata

  • Conceptual mappings

  • Technical mapping suites

  • Vocabulary mapping suites

  • Test data suites

  • SPARQL validation suites

  • SHACL validation suites

Advanced Features

Custom Extraction

from mapping_suite_sdk.adapters.extractor import MappingPackageExtractorABC

class CustomPackageExtractor(MappingPackageExtractorABC):
    def extract(self, source, destination, **kwargs):
        # Implement custom extraction logic
        pass

    def extract_temporary(self, source, **kwargs):
        # Implement temporary extraction
        pass

Tracing and Monitoring

import mapping_suite_sdk as mssdk
from opentelemetry.sdk.trace.export import ConsoleSpanExporter

# Enable tracing
mssdk.set_mssdk_tracing(True)

# Add console exporter for trace details
console_exporter = ConsoleSpanExporter()
mssdk.add_span_processor_to_mssdk_tracer_provider(console_exporter)

Performance and Scalability

  • Efficient package handling

  • Minimal resource overhead

  • Supports large mapping packages

  • Configurable tracing and monitoring

Contributing

Contributions are welcome! Please follow our contribution guidelines:

  1. Fork the repository

  2. Create your feature branch

  3. Commit your changes

  4. Push to the branch

  5. Create a Pull Request

License

Apache License 2.0