Language Binding Guidelines

Version: 1.0
Last Updated: 2025-11-06
Status: Active Development Guidelines

Overview
Core Principles
Required API Surface
FFI Boundary Design
Testing Requirements
Security Requirements
Documentation Requirements
CI/CD Integration
Language-Specific Patterns
Reference Implementations

Overview

This document provides comprehensive guidelines for creating new language bindings for the ASTERIX decoder. All bindings share the same C++ core parser but expose idiomatic APIs for their target languages.

Current Bindings:

C++ (native) - High-performance CLI
Python 3.10-3.14 - Simple, intuitive API
Rust 1.87+ - Type-safe, memory-safe bindings

Planned Bindings:

Node.js (N-API/NAPI-RS)
Go (CGO)
Java (JNI)
C# (.NET Interop)

Core Principles

1. Safety First

All language bindings MUST implement a validated FFI boundary layer:

┌─────────────────────────────────────┐
│   Idiomatic Language API            │  ← User-facing, language-specific
│   (ergonomic, safe, convenient)     │
└──────────────┬──────────────────────┘
               │
┌──────────────┴──────────────────────┐
│   FFI Boundary Layer                │  ← VALIDATION LAYER (CRITICAL)
│   - Input validation                │
│   - Bounds checking                 │
│   - Type conversion                 │
│   - Error mapping                   │
└──────────────┬──────────────────────┘
               │
┌──────────────┴──────────────────────┐
│   C++ Core Parser                   │  ← Shared parsing engine
│   (libasterix)                      │
└─────────────────────────────────────┘

Key Safety Requirements:

Validate ALL inputs at FFI boundary
Check buffer bounds before passing to C++
Prevent integer overflows in size/offset calculations
Map C++ errors to language-specific exceptions
No silent failures - all errors propagated
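
The checks listed above can be sketched as a small validator that runs before any data crosses the FFI boundary. This is a minimal illustration, not the project's implementation: the name `validate_parse_args` and the constants `MAX_MESSAGE_SIZE` and `MAX_BLOCKS` (including their values) are assumptions made for the example.

```python
MAX_MESSAGE_SIZE = 65536  # assumed limit, for illustration only
MAX_BLOCKS = 10000        # assumed limit, for illustration only


def validate_parse_args(data, offset=0, count=1):
    """Validate arguments before they cross the FFI boundary.

    Returns the validated (data, offset, count) tuple, or raises an
    exception describing the first check that failed.
    """
    if not isinstance(data, (bytes, bytearray, memoryview)):
        raise TypeError(f"data must be bytes-like, got {type(data).__name__}")
    if len(data) == 0:
        raise ValueError("Empty input data")
    if len(data) > MAX_MESSAGE_SIZE:
        raise ValueError(
            f"Input too large: {len(data)} bytes (max {MAX_MESSAGE_SIZE})"
        )
    # bool is a subclass of int, so reject it explicitly
    for name, value in (("offset", offset), ("count", count)):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int")
    # Bounds checks: the offset must point inside the buffer
    if not 0 <= offset < len(data):
        raise ValueError(f"Offset {offset} outside data of length {len(data)}")
    if not 1 <= count <= MAX_BLOCKS:
        raise ValueError(f"count {count} outside 1..{MAX_BLOCKS}")
    return data, offset, count
```

Every failure raises instead of returning a sentinel, which keeps the "no silent failures" rule visible at the call site.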

2. Zero-Copy Where Possible

Minimize data copying across FFI boundaries:

Use buffer views/slices for input data (don’t copy unless necessary)
Return owned data structures (let language runtime manage lifetime)
Avoid double-serialization (e.g., C++ → JSON → language struct)

Example (Rust - zero-copy input):

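pub fn parse(data: &[u8]) -> Result<Vec<AsterixRecord>> {
    // Pass the slice pointer directly to C++ (zero-copy input)
    let raw = unsafe { ffi::asterix_parse(data.as_ptr(), data.len()) };
    // Copy the results into an owned Vec (Rust manages that memory), then
    // release the C++ allocation; Memory Management shows the null check
    let records = unsafe { convert_asterix_data(raw) };
    unsafe { ffi::asterix_free_data(raw) };
    records
}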
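
The same idea can be illustrated in Python with `memoryview`, which exposes a window onto an existing buffer without copying it. The helper name `payload_view` is hypothetical and only serves this sketch.

```python
def payload_view(data: bytes, offset: int) -> memoryview:
    """Return a zero-copy view of data starting at offset."""
    if not 0 <= offset <= len(data):
        raise ValueError(f"Offset {offset} outside data of length {len(data)}")
    # Slicing a memoryview creates a new view over the same memory
    return memoryview(data)[offset:]


data = bytes(range(16))
view = payload_view(data, 4)
assert view.obj is data      # still backed by the original buffer
assert view[0] == data[4]    # the view starts at the requested offset

copied = data[4:]            # slicing bytes directly allocates a new object
assert copied == bytes(view)
```

A view keeps the underlying buffer alive for as long as the view exists, so it is best released before the input is expected to be freed.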

3. Idiomatic APIs

Each binding should feel natural to developers in that language:

Language	Idiom	Example
Python	Dicts, keyword args	`parse(data, verbose=True)`
Rust	Result types, builders	`Parser::new().build()?`
JavaScript	Promises, callbacks	`await parse(buffer)`
Go	Error values, channels	`records, err := Parse(data)`
Java	Exceptions, streams	`parser.parse(bytes).stream()`

4. Comprehensive Error Handling

Errors should provide actionable information:

# BAD: Generic error
raise RuntimeError("Parse failed")

# GOOD: Detailed error with context
raise AsterixParseError(
    offset=1024,
    category=48,
    message="Invalid FSPEC: unexpected FX bit",
    raw_data=hexdump(data[1024:1040])
)
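
For the GOOD example to run, the exception class and the `hexdump` helper have to exist somewhere. A minimal sketch follows; the class layout, the message format, and `hexdump` itself are assumptions for illustration, not the binding's actual definitions.

```python
class AsterixParseError(Exception):
    """Parse error carrying enough context to locate the offending bytes."""

    def __init__(self, offset, category, message, raw_data=""):
        self.offset = offset
        self.category = category
        self.message = message
        self.raw_data = raw_data
        text = f"CAT{category:03d} parse error at offset {offset}: {message}"
        if raw_data:
            text += f" [{raw_data}]"
        super().__init__(text)


def hexdump(chunk):
    """Render bytes as space-separated hex pairs."""
    return bytes(chunk).hex(" ")


data = bytes([0x30, 0x00, 0x05, 0xFF, 0x01])
err = AsterixParseError(
    offset=3,
    category=48,
    message="Invalid FSPEC: unexpected FX bit",
    raw_data=hexdump(data[3:5]),
)
print(err)
```

Keeping `offset`, `category`, and `raw_data` as attributes lets callers log or filter errors without parsing the message string.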

Required API Surface

All language bindings MUST provide these core APIs:

1. Initialization

Purpose: Load ASTERIX category definitions

init_default()                    - Load all default categories
init_from_path(config_dir)        - Load from custom directory
load_category(xml_file)           - Load single category file
is_category_defined(category)     - Check if category is loaded

Python Example:

import asterix

# Auto-initializes on import
asterix.init("custom/config/path")
asterix.load_category("asterix_cat099.xml")

Rust Example:

use asterix::Parser;

let parser = Parser::new()
    .init_default()?
    .add_category("asterix_cat099.xml")?
    .build()?;

2. Parsing

Purpose: Parse ASTERIX data from bytes

parse(data: bytes) -> Vec<Record>
parse_with_offset(data, offset, count) -> ParseResult

Return Type:

Record {
    category: u8,
    length: u32,
    timestamp_ms: u64,
    crc: u32,
    hex_data: String,
    items: Map<String, DataItem>
}

ParseResult {
    records: Vec<Record>,
    bytes_consumed: usize,
    remaining_blocks: usize
}
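
One way a binding might mirror these return types is with dataclasses. The sketch below is an assumption for illustration (the Python examples in this document return plain dicts); it shows how the fixed-width integer fields can be range-checked in a language with unbounded integers.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Record:
    category: int       # u8
    length: int         # u32, data block length in bytes
    timestamp_ms: int   # u64
    crc: int            # u32
    hex_data: str
    items: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        # Python ints are unbounded, so enforce the u8 range explicitly
        if not 0 <= self.category <= 0xFF:
            raise ValueError(f"category {self.category} does not fit in u8")


@dataclass
class ParseResult:
    records: List[Record] = field(default_factory=list)
    bytes_consumed: int = 0
    remaining_blocks: int = 0


record = Record(category=48, length=3, timestamp_ms=0, crc=0, hex_data="300003")
result = ParseResult(records=[record], bytes_consumed=3)
```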

Python Example:

data = open("sample.pcap", "rb").read()
records = asterix.parse(data)

for record in records:
    print(f"Category {record['category']}: {len(record['items'])} items")

Rust Example:

let data = std::fs::read("sample.pcap")?;
let options = ParseOptions::default();
let records = parser.parse(&data, &options)?;

for record in records {
    println!("Category {}: {} items", record.category, record.items.len());
}

3. Metadata/Description

Purpose: Get human-readable descriptions

describe(category, item, field, value) -> String

Example:

# Get category description
desc = asterix.describe(48, None, None, None)
# "Monoradar Target Reports"

# Get item description
desc = asterix.describe(48, "010", None, None)
# "Data Source Identifier"

# Get field value description
desc = asterix.describe(48, "010", "SAC", "7")
# "System Area Code: 7"

4. Configuration

Purpose: Control parser behavior

set_verbose(enabled: bool)
set_log_level(level: LogLevel)
get_log_level() -> LogLevel

Rust Example:

use asterix::{set_log_level, LogLevel};

set_log_level(LogLevel::Debug);  // Enable verbose output
let records = parser.parse(&data, &options)?;
set_log_level(LogLevel::Silent); // Silence output
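
A Python binding could expose the same three calls along these lines. This is a hedged sketch: only the Silent and Debug levels appear in this document, so the other enum members, their numeric values, and the mapping from `set_verbose` to a level are assumptions.

```python
from enum import IntEnum


class LogLevel(IntEnum):
    # SILENT and DEBUG mirror the Rust example; ERROR and INFO are assumed
    SILENT = 0
    ERROR = 1
    INFO = 2
    DEBUG = 3


# Module-level state standing in for the setting held by the C++ core
_state = {"level": LogLevel.SILENT}


def set_log_level(level):
    """Set the log level; LogLevel() raises ValueError for unknown values."""
    _state["level"] = LogLevel(level)


def get_log_level():
    return _state["level"]


def set_verbose(enabled):
    """Convenience wrapper: verbose maps to DEBUG, otherwise SILENT."""
    set_log_level(LogLevel.DEBUG if enabled else LogLevel.SILENT)
```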

FFI Boundary Design

Input Validation Checklist

ALL FFI functions MUST validate inputs:

#include <cstddef>
#include <cstdint>
#include <cstdio>

extern "C" int asterix_parse_safe(
    const uint8_t* data,
    size_t data_len,
    AsterixRecord** out_records,
    size_t* out_count,
    char* error_buffer,
    size_t error_buffer_size)
{
    // The error buffer is optional: only write to it when one was supplied
    const bool can_report = error_buffer != nullptr && error_buffer_size > 0;

    // 1. NULL POINTER CHECKS
    if (!data || !out_records || !out_count) {
        if (can_report)
            snprintf(error_buffer, error_buffer_size, "Null pointer argument");
        return ERROR_INVALID_ARGUMENT;
    }

    // 2. EMPTY INPUT CHECK
    if (data_len == 0) {
        if (can_report)
            snprintf(error_buffer, error_buffer_size, "Empty input data");
        return ERROR_EMPTY_INPUT;
    }

    // 3. SIZE LIMITS (MAX_MESSAGE_SIZE must be a size_t to match %zu)
    if (data_len > MAX_MESSAGE_SIZE) {
        if (can_report)
            snprintf(error_buffer, error_buffer_size,
                "Input too large: %zu bytes (max %zu)", data_len, MAX_MESSAGE_SIZE);
        return ERROR_INPUT_TOO_LARGE;
    }

    // 4. CALL CORE PARSER (after validation)
    return asterix_parse_internal(data, data_len, out_records, out_count);
}

Error Code Mapping

Map C++ error codes to language-specific exceptions:

C++ Error Codes:

enum AsterixError {
    OK = 0,
    ERROR_EMPTY_INPUT = 1,
    ERROR_INVALID_CATEGORY = 2,
    ERROR_TRUNCATED_MESSAGE = 3,
    ERROR_BUFFER_OVERFLOW = 4,
    ERROR_INVALID_FSPEC = 5,
    ERROR_MALFORMED_DATA = 6,
    ERROR_OUT_OF_MEMORY = 7,
    ERROR_INVALID_ARGUMENT = 8,
    ERROR_INPUT_TOO_LARGE = 9,  // returned by asterix_parse_safe; numeric value assumed
};

Python Mapping:

class AsterixError(Exception):
    """Base exception for ASTERIX decoder"""
    pass

class ParseError(AsterixError):
    """Parse-time error with offset context"""
    def __init__(self, offset, message):
        self.offset = offset
        self.message = message
        super().__init__(f"Parse error at offset {offset}: {message}")

class InvalidDataError(AsterixError):
    """Invalid input data"""
    pass

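# Map C++ error codes to Python exception classes. Instances are created
# at raise time so each error carries its own message and traceback.
ERROR_MAP = {
    1: InvalidDataError,  # ERROR_EMPTY_INPUT
    2: InvalidDataError,  # ERROR_INVALID_CATEGORY
    3: ParseError,        # ERROR_TRUNCATED_MESSAGE
    4: ParseError,        # ERROR_BUFFER_OVERFLOW
    # ...
}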
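
To show how such a mapping might be used at the boundary, here is a self-contained sketch that turns a return code into a raised exception. The function `raise_for_code` and the factory table are hypothetical; the exception classes are redefined so the block runs on its own.

```python
class AsterixError(Exception):
    """Base exception for this sketch."""


class InvalidDataError(AsterixError):
    pass


class ParseError(AsterixError):
    def __init__(self, offset, message):
        self.offset = offset
        self.message = message
        super().__init__(f"Parse error at offset {offset}: {message}")


# Error code -> factory taking (offset, message); codes follow the C++ enum
_FACTORIES = {
    1: lambda offset, message: InvalidDataError(message or "Empty input data"),
    2: lambda offset, message: InvalidDataError(
        message or "Invalid ASTERIX category"
    ),
    3: ParseError,
    4: ParseError,
}


def raise_for_code(code, offset=0, message=""):
    """Raise the exception mapped to a non-zero C++ error code."""
    if code == 0:
        return  # OK
    factory = _FACTORIES.get(code)
    if factory is None:
        # Unmapped codes still surface as errors rather than failing silently
        raise AsterixError(f"Unknown error code {code}: {message}")
    raise factory(offset, message)
```

Building the exception at raise time avoids sharing one exception instance across unrelated failures.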

Rust Mapping:

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AsterixError {
    EmptyInput,
    InvalidCategory(u8),
    ParseError { offset: usize, message: String },
    BufferOverflow { offset: usize },
    InvalidFspec { offset: usize },
    MalformedData(String),
    OutOfMemory,
    InvalidArgument(String),
    IOError(String),
}

impl std::error::Error for AsterixError {}

impl std::fmt::Display for AsterixError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        match self {
            AsterixError::EmptyInput => write!(f, "Empty input data"),
            AsterixError::InvalidCategory(cat) => write!(f, "Invalid category: {}", cat),
            AsterixError::ParseError { offset, message } =>
                write!(f, "Parse error at offset {}: {}", offset, message),
            // ...
        }
    }
}

Memory Management

Ownership Rules:

Input buffers: Borrowed (FFI receives pointer + length)
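Output data: Owned by the language runtime after conversion (C++ allocates, the binding copies the results into runtime-owned objects, then releases the C++ memory through the matching free function, e.g. asterix_free_records)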
Error strings: C++ allocates with malloc(), language calls free()
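
The "convert, then free" rule for output data can be expressed as a small helper that releases the native allocation on every path. This is a sketch under stated assumptions: `convert_and_release` is a hypothetical name, and `convert` and `release` stand in for binding-specific functions.

```python
def convert_and_release(handle, convert, release):
    """Copy native results into Python objects, then free the native memory.

    `convert` turns the native handle into Python-owned objects and
    `release` frees the native allocation.
    """
    try:
        return convert(handle)   # the returned objects are owned by Python
    finally:
        release(handle)          # runs on success and on conversion errors


# Stand-ins that only demonstrate the call order
freed = []
records = convert_and_release(
    {"records": [48, 62]},
    convert=lambda handle: list(handle["records"]),
    release=freed.append,
)
```

Using `try`/`finally` means a failure during conversion cannot leak the native buffer.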

Python Example (C Extension):

#define PY_SSIZE_T_CLEAN  /* required for the "y#" format */
#include <Python.h>

static PyObject* asterix_parse(PyObject* self, PyObject* args) {
    const char* buffer;
    Py_ssize_t length;
    char error_buf[1024] = {0};

    // Parse args (borrows buffer, does NOT copy)
    if (!PyArg_ParseTuple(args, "y#", &buffer, &length)) {
        return NULL;
    }

    // Validate
    if (length == 0) {
        PyErr_SetString(PyExc_ValueError, "Empty input data");
        return NULL;
    }

    // Call C++ (buffer is borrowed)
    AsterixRecord* records = NULL;
    size_t count = 0;
    int result = asterix_parse_safe(
        (const uint8_t*)buffer, (size_t)length, &records, &count,
        error_buf, sizeof(error_buf));

    if (result != 0) {
        PyErr_SetString(PyExc_RuntimeError, error_buf);
        return NULL;
    }

    // Convert to Python objects (Python now owns the data)
    PyObject* list = PyList_New((Py_ssize_t)count);
    for (size_t i = 0; list != NULL && i < count; i++) {
        PyObject* record = build_record_object(&records[i]);
        if (record == NULL) {
            Py_CLEAR(list);  // drop the partial list; the exception is already set
            break;
        }
        PyList_SET_ITEM(list, (Py_ssize_t)i, record);  // steals the reference
    }

    // Free C++ allocated memory on success and on failure
    asterix_free_records(records, count);

    return list;
}

Rust Example (CXX Bridge):

#[cxx::bridge]
mod ffi {
    unsafe extern "C++" {
        include!("ffi_wrapper.h");

        type AsterixDataWrapper;

        // CXX requires `unsafe fn` for functions that take raw pointers
        unsafe fn asterix_parse(data: *const u8, len: usize) -> *mut AsterixDataWrapper;
        unsafe fn asterix_free_data(ptr: *mut AsterixDataWrapper);
    }
}

pub fn parse(data: &[u8]) -> Result<Vec<AsterixRecord>> {
    unsafe {
        // Borrow input (zero-copy)
        let data_ptr = ffi::asterix_parse(data.as_ptr(), data.len());

        if data_ptr.is_null() {
            return Err(AsterixError::NullPointer);
        }

        // Convert to Rust types (copies data, Rust now owns)
        let records = convert_asterix_data(data_ptr);

        // Free C++ memory before returning, even if the conversion failed
        ffi::asterix_free_data(data_ptr);

        records
    }
}

Testing Requirements

All language bindings MUST meet these testing requirements:

1. Test Coverage: >80%

Metrics:

Line coverage: >80%
Branch coverage: >75%
Function coverage: >90%

Tools:

Python: coverage.py, pytest-cov
Rust: cargo-tarpaulin, cargo-llvm-cov
Node.js: nyc, jest --coverage
Go: go test -cover

Example (Python):

pytest --cov=asterix --cov-report=html --cov-report=term-missing
# Fails CI if coverage < 80%
coverage report --fail-under=80

2. Unit Tests

Test individual functions in isolation:

# tests/test_ffi_validation.py
import pytest
import asterix

def test_parse_empty_input():
    """Empty input should raise ValueError"""
    with pytest.raises(ValueError, match="Empty input"):
        asterix.parse(b"")

def test_describe_invalid_category():
    """Invalid category should raise ValueError"""
    with pytest.raises(ValueError, match="Invalid category"):
        asterix.describe(0, None, None, None)  # Category 0 is invalid

def test_parse_buffer_overflow():
    """Offset beyond buffer should raise ParseError"""
    data = b"\x30\x00\x10" * 10  # 30 bytes
    with pytest.raises(asterix.ParseError):
        asterix.parse_with_offset(data, offset=50, count=1)

3. Integration Tests

Test end-to-end parsing with real ASTERIX data:

# tests/test_integration.py
import asterix
import os

def test_parse_cat048_pcap():
    """Parse real CAT048 PCAP file"""
    test_file = "tests/data/cat048_sample.pcap"
    data = open(test_file, "rb").read()

    records = asterix.parse(data)

    assert len(records) > 0
    assert all(r['category'] == 48 for r in records)
    assert all('items' in r for r in records)

def test_incremental_parsing():
    """Parse large file incrementally"""
    test_file = "tests/data/large_file.asterix"
    data = open(test_file, "rb").read()

    all_records = []
    offset = 0

    while offset < len(data):
        result = asterix.parse_with_offset(data, offset, count=100)
        all_records.extend(result['records'])
        offset += result['bytes_consumed']  # advance by the bytes consumed in this call

        if result['remaining_blocks'] == 0:
            break

    assert len(all_records) > 0
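
The incremental loop can also be wrapped in a generator so callers never manage offsets themselves. This is a sketch, assuming each result is a dict with `records`, `bytes_consumed` (bytes consumed by that call), and `remaining_blocks`, as in ParseResult. The name `iter_records` is hypothetical, and the parser is passed in as a parameter so the block does not depend on the installed binding.

```python
def iter_records(data, parse_with_offset, chunk=100):
    """Yield records chunk by chunk from parse_with_offset results."""
    offset = 0
    while offset < len(data):
        result = parse_with_offset(data, offset, chunk)
        yield from result["records"]
        consumed = result["bytes_consumed"]
        if consumed <= 0:
            # No forward progress: stop instead of looping forever
            raise RuntimeError(f"Parser made no progress at offset {offset}")
        offset += consumed
        if result["remaining_blocks"] == 0:
            break
```

With the real binding this would be called as `iter_records(data, asterix.parse_with_offset)`. The explicit progress check guards against an endless loop if a call ever reports zero bytes consumed.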

4. Property-Based Tests

Use fuzzing/property testing to discover edge cases:

Python (Hypothesis):

from hypothesis import given, strategies as st
import asterix

@given(st.binary(min_size=1, max_size=1024))
def test_parse_never_crashes(data):
    """Parser should never crash, even on random input"""
    try:
        records = asterix.parse(data)
        # If parse succeeds, records should be valid
        assert isinstance(records, list)
    except (ValueError, RuntimeError):
        # Expected errors are OK
        pass

Rust (proptest):

use proptest::prelude::*;

proptest! {
    #[test]
    fn parse_never_panics(data: Vec<u8>) {
        // Parser should never panic
        let _ = parse(&data, ParseOptions::default());
    }

    #[test]
    fn offset_bounds_checked(data: Vec<u8>, offset in 0usize..10000) {
        // Out-of-bounds offset should return error, not panic
        if offset >= data.len() {
            assert!(parse_with_offset(&data, offset, 1, ParseOptions::default()).is_err());
        }
    }
}

5. Memory Leak Tests

Verify no memory leaks with repeated parsing:

Python:

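import gc
import tracemalloc

import asterix

def test_no_memory_leaks():
    """Repeated parsing should not leak memory"""
    with open("tests/data/sample.pcap", "rb") as f:
        data = f.read()

    tracemalloc.start()

    # Warm up so one-time allocations are part of the baseline
    # (measured right after start(), the baseline would be close to zero)
    for _ in range(10):
        asterix.parse(data)
    gc.collect()
    baseline = tracemalloc.get_traced_memory()[0]

    # Parse 1000 times
    for _ in range(1000):
        records = asterix.parse(data)
        del records
        gc.collect()

    current = tracemalloc.get_traced_memory()[0]
    tracemalloc.stop()

    # Memory should not grow significantly: allow 10% of the baseline,
    # with a 64 KiB floor so a small baseline does not cause flaky failures.
    # tracemalloc only sees allocations made through Python's allocators;
    # leaks inside the C++ core need Valgrind or ASAN.
    assert current - baseline < max(baseline * 0.1, 64 * 1024)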

Rust:

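# Use Valgrind/ASAN for leak detection
# Sanitizers are unstable rustc features and need a nightly toolchain
RUSTFLAGS="-Z sanitizer=address" cargo +nightly test

# Valgrind must wrap the test binary rather than cargo itself:
# build with `cargo test --no-run`, then run the binary that cargo reports
valgrind --leak-check=full <path-to-test-binary>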

Security Requirements

All language bindings MUST pass security audits:

1. Static Analysis

Required Tools:

Language	Tool	Purpose
Python	Bandit	Security linter
Python	Safety	Dependency vulnerability scanner
Rust	Clippy	Linter with security checks
Rust	cargo-audit	Dependency vulnerability scanner
Node.js	ESLint (security plugin)	Security linter
C++	CodeQL	Semantic code analysis

Python Example:

# Run Bandit, reporting HIGH severity issues only (-lll); a non-zero exit fails CI
bandit -r asterix/ -lll -f json -o bandit-report.json

# Check dependencies for known vulnerabilities
safety check --json

Rust Example:

# Run Clippy with strict lints
cargo clippy --all-targets --all-features -- -D warnings

# Audit dependencies
cargo audit

2. Fuzzing

Integrate fuzzing for FFI boundary validation:

Python (Atheris - libFuzzer for Python):

import atheris
import sys
import asterix

@atheris.instrument_func
def TestOneInput(data):
    """Fuzz parser with random input"""
    try:
        asterix.parse(bytes(data))
    except (ValueError, RuntimeError):
        pass  # Expected errors

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

Rust (cargo-fuzz):

#![no_main]
use libfuzzer_sys::fuzz_target;
use asterix::{parse, ParseOptions};

fuzz_target!(|data: &[u8]| {
    let _ = parse(data, ParseOptions::default());
});

Run Fuzzing (CI nightly):

# Python (an Atheris harness is run as a normal script; libFuzzer flags pass through)
python fuzz_parse.py -max_total_time=3600

# Rust
cargo fuzz run fuzz_parse -- -max_total_time=3600

3. ASAN/MSAN/UBSAN

Run tests with sanitizers enabled:

Rust:

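# Sanitizers are unstable rustc features and need a nightly toolchain

# Address Sanitizer (memory safety)
RUSTFLAGS="-Z sanitizer=address" cargo +nightly test

# Memory Sanitizer (uninitialized reads); the standard library must be
# rebuilt with instrumentation, which requires an explicit target
RUSTFLAGS="-Z sanitizer=memory" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu

# Undefined Behavior Sanitizer: rustc has no "undefined" sanitizer, so
# UBSan applies to the C++ core only (build it with -fsanitize=undefined)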

Python (C extension):

# Compile with ASAN
CC="clang -fsanitize=address" python setup.py build_ext --inplace

# Run tests with ASAN
ASAN_OPTIONS=detect_leaks=1 python -m pytest

4. CodeQL Analysis

Integrate GitHub CodeQL for semantic analysis:

.github/workflows/codeql.yml:

name: CodeQL Security Analysis

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write  # needed to upload CodeQL results

    steps:
    - uses: actions/checkout@v3

    - name: Initialize CodeQL
      uses: github/codeql-action/init@v3
      with:
        languages: python, cpp

    - name: Build
      run: python setup.py build_ext --inplace

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v3

Documentation Requirements

All language bindings MUST provide comprehensive documentation:

1. API Reference

Auto-generated from docstrings/comments:

Python (Sphinx):

def parse(data: bytes, verbose: bool = False) -> List[Dict]:
    """Parse ASTERIX data from bytes.

    Args:
        data: Raw ASTERIX bytes (raw, PCAP, HDLC, etc.)
        verbose: Enable verbose output (default: False)

    Returns:
        List of parsed ASTERIX records, each record is a dict with:
        - category (int): ASTERIX category (1-255)
        - length (int): Data block length in bytes
        - timestamp_ms (int): Timestamp in milliseconds
        - items (dict): Parsed data items

    Raises:
        ValueError: If input data is empty or invalid
        RuntimeError: If parser encounters an error

    Example:
        >>> data = open("sample.pcap", "rb").read()
        >>> records = asterix.parse(data)
        >>> print(records[0]['category'])
        48
    """

Rust (rustdoc):

/// Parse ASTERIX data from bytes.
///
/// # Arguments
///
/// * `data` - Raw ASTERIX bytes (can be raw binary, PCAP, HDLC, etc.)
/// * `options` - Parsing configuration options
///
/// # Returns
///
/// A vector of parsed ASTERIX records on success.
///
/// # Errors
///
/// Returns an error if:
/// - Input data is empty ([`AsterixError::EmptyInput`])
/// - Requested category is not defined ([`AsterixError::InvalidCategory`])
/// - Parser encounters malformed data ([`AsterixError::ParseError`])
///
/// # Example
///
/// ```no_run
/// use asterix::{parse, ParseOptions};
///
/// let data = std::fs::read("sample.pcap")?;
/// let options = ParseOptions::default();
/// let records = parse(&data, options)?;
///
/// for record in records {
///     println!("Category {}: {} items", record.category, record.items.len());
/// }
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
pub fn parse(data: &[u8], options: ParseOptions) -> Result<Vec<AsterixRecord>> {
    // ...
}

2. Tutorials

Quickstart guide for each binding:

Python Quickstart

Installation

pip install asterix_decoder

Basic Usage

import asterix

# Parse ASTERIX data
data = open("sample.pcap", "rb").read()
records = asterix.parse(data)

# Print parsed records
for record in records:
    print(f"Category {record['category']}: {len(record['items'])} items")
    for item_name, item_data in record['items'].items():
        print(f"  {item_name}: {item_data}")

Advanced Usage

Incremental Parsing

# Parse large files incrementally
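offset = 0
while offset < len(data):
    result = asterix.parse_with_offset(data, offset, count=100)
    for record in result['records']:
        process(record)
    # Advance by the bytes consumed in this call; stop if nothing was consumed
    if result['bytes_consumed'] == 0:
        break
    offset += result['bytes_consumed']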

Custom Categories

# Load custom category definition
asterix.load_category("custom_category.xml")

3. Migration Guide

From other bindings:

Migrating from Python to Rust

Python

import asterix
data = open("file.pcap", "rb").read()
records = asterix.parse(data)

Rust Equivalent

use asterix::{Parser, ParseOptions};

let data = std::fs::read("file.pcap")?;
let parser = Parser::new().init_default()?.build()?;
let options = ParseOptions::default();
let records = parser.parse(&data, &options)?;

Key Differences

CI/CD Integration

All language bindings MUST integrate with CI/CD:

1. Build Matrix

Test on multiple OS/compiler/runtime versions:

.github/workflows/binding-test.yml:

name: Language Binding Tests

on: [push, pull_request]

jobs:
  python:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ['3.10', '3.11', '3.12', '3.13', '3.14']

    runs-on: ${{ matrix.os }}

    steps:
    - uses: actions/checkout@v3

    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}

    - name: Install dependencies
      run: |
        pip install pytest pytest-cov hypothesis

    - name: Build extension
      run: python setup.py build_ext --inplace

    - name: Run tests
      run: pytest --cov=asterix --cov-report=xml

    - name: Upload coverage
      uses: codecov/codecov-action@v3

  rust:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        rust: [stable, beta, nightly]

    runs-on: ${{ matrix.os }}

    steps:
    - uses: actions/checkout@v3

    - name: Install Rust
      uses: actions-rs/toolchain@v1
      with:
        toolchain: ${{ matrix.rust }}

    - name: Build
      run: cargo build --all-features

    - name: Run tests
      run: cargo test --all-features

    - name: Run clippy
      run: cargo clippy -- -D warnings

2. Release Automation

Automate package publishing:

Python (PyPI):

- name: Build wheels
  run: python -m build

- name: Publish to PyPI
  uses: pypa/gh-action-pypi-publish@release/v1
  with:
    password: $

Rust (crates.io):

- name: Publish to crates.io
  run: cargo publish --token $

3. Performance Regression Detection

Track performance over time:

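# On the master branch: record the baseline
- name: Run benchmarks
  run: |
    cargo bench --bench benchmarks -- --save-baseline master

# On a pull request branch: compare against the saved baseline
# (in the same run, the comparison would only measure master against itself)
- name: Compare with baseline
  run: |
    cargo bench --bench benchmarks -- --baseline master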

Language-Specific Patterns

Python (C Extension)

Structure:

asterix/
├── __init__.py           # User-facing API
├── _asterix.so           # C extension (built from src/python/)
└── py.typed              # PEP 561 type stub marker

src/python/
├── asterix_wrapper.c     # Python C API wrapper
├── python_parser.cpp     # C++ bridge
└── python_parser.h       # Bridge header

Type Stubs (PEP 484):

# asterix/__init__.pyi
from typing import Dict, List, Optional

def parse(data: bytes, verbose: bool = False) -> List[Dict]: ...
def parse_with_offset(
    data: bytes,
    offset: int,
    count: int,
    verbose: bool = False
) -> Dict: ...
def describe(
    category: int,
    item: Optional[str] = None,
    field: Optional[str] = None,
    value: Optional[str] = None
) -> str: ...

Rust (CXX Bridge)

Structure:

asterix-rs/
├── Cargo.toml
├── build.rs              # Build script (compiles C++ via CMake)
├── src/
│   ├── lib.rs           # Public API
│   ├── ffi.rs           # CXX bridge
│   ├── parser.rs        # Safe parser wrapper
│   ├── types.rs         # Rust data types
│   └── error.rs         # Error types
├── tests/               # Integration tests
├── examples/            # Example programs
└── benches/             # Benchmarks

CXX Bridge Pattern:

#[cxx::bridge(namespace = "asterix")]
mod ffi {
    unsafe extern "C++" {
        include!("ffi_wrapper.h");

        type AsterixDataWrapper;

        fn asterix_init(config_dir: &str) -> bool;
        // CXX requires `unsafe fn` for functions that take raw pointers
        unsafe fn asterix_parse(data: *const u8, len: usize) -> *mut AsterixDataWrapper;
        unsafe fn asterix_free_data(ptr: *mut AsterixDataWrapper);
    }
}

Node.js (N-API)

Structure:

asterix-node/
├── package.json
├── binding.gyp          # Native addon build config
├── lib/
│   └── index.js        # JavaScript API
├── src/
│   ├── addon.cpp       # N-API wrapper
│   └── parser_wrapper.cpp
└── test/
    └── test.js

N-API Pattern:

Napi::Value Parse(const Napi::CallbackInfo& info) {
    Napi::Env env = info.Env();

    // Validate arguments
    if (info.Length() < 1 || !info[0].IsBuffer()) {
        Napi::TypeError::New(env, "Expected Buffer").ThrowAsJavaScriptException();
        return env.Null();
    }

    // Get buffer
    Napi::Buffer<uint8_t> buffer = info[0].As<Napi::Buffer<uint8_t>>();
    const uint8_t* data = buffer.Data();
    size_t length = buffer.Length();

    // Call C++ parser
    AsterixRecord* records = NULL;
    size_t count = 0;
    char error_buf[1024] = {0};
    int result = asterix_parse_safe(
        data, length, &records, &count, error_buf, sizeof(error_buf));

    if (result != 0) {
        Napi::Error::New(env, error_buf).ThrowAsJavaScriptException();
        return env.Null();
    }

    // Convert to JavaScript objects
    Napi::Array array = Napi::Array::New(env, count);
    for (size_t i = 0; i < count; i++) {
        array[i] = ConvertRecord(env, &records[i]);
    }

    asterix_free_records(records, count);
    return array;
}

Napi::Object Init(Napi::Env env, Napi::Object exports) {
    exports.Set("parse", Napi::Function::New(env, Parse));
    return exports;
}

NODE_API_MODULE(asterix, Init)

Reference Implementations

Python Binding

File: src/python/asterix_wrapper.c

Key Features:

Comprehensive input validation (Issues #29, CRITICAL-002)
Proper Python exception mapping
Memory safety (refcounting)
Integration with C++ core via python_parser.cpp

Example Validation:

static PyObject* asterix_parse_with_offset(PyObject* self, PyObject* args) {
    const char* buffer;
    Py_ssize_t length;
    unsigned int offset;
    unsigned int blocks_count;
    int verbose = 0;

    if (!PyArg_ParseTuple(args, "y#II|p", &buffer, &length, &offset, &blocks_count, &verbose)) {
        return NULL;
    }

    // CRITICAL-002 FIX: Validate offset bounds
    if (offset >= (unsigned int)length) {
        PyErr_Format(PyExc_ValueError,
            "Offset %u exceeds data length %zd", offset, length);
        return NULL;
    }

    // Validate blocks_count (HIGH-002 equivalent)
    if (blocks_count > 10000) {
        PyErr_Format(PyExc_ValueError,
            "blocks_count %u exceeds maximum (10000)", blocks_count);
        return NULL;
    }

    return python_parse_with_offset(
        (const unsigned char*)buffer, length, offset, blocks_count, verbose);
}

Rust Binding

File: asterix-rs/src/ffi.rs

Key Features:

Type-safe FFI via CXX crate
Comprehensive input validation (Issues #29, CRITICAL-004, CRITICAL-005)
Zero unsafe code in public API
Idiomatic Rust error handling (Result types)

Example Validation:

pub fn parse(data: &[u8], options: ParseOptions) -> Result<Vec<AsterixRecord>> {
    // CRITICAL-005 FIX: Validate input data length
    if data.is_empty() {
        return Err(AsterixError::InvalidData("Empty input data".to_string()));
    }

    if data.len() > MAX_ASTERIX_MESSAGE_SIZE {
        return Err(AsterixError::InvalidData(format!(
            "Input data too large: {} bytes (maximum {} bytes)",
            data.len(), MAX_ASTERIX_MESSAGE_SIZE
        )));
    }

    unsafe {
        let data_ptr = ffi::ffi::asterix_parse(data.as_ptr(), data.len(), options.verbose);

        if data_ptr.is_null() {
            return Err(AsterixError::NullPointer(
                "C++ parser returned null (check if ASTERIX is initialized)".to_string(),
            ));
        }

        let result = convert_asterix_data(data_ptr, &options);
        ffi::ffi::asterix_free_data(data_ptr);
        result
    }
}

Checklist for New Bindings

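Use this checklist when creating a new language binding:

Required API surface implemented (initialization, parsing, metadata/description, configuration)
All inputs validated at the FFI boundary (null pointers, empty input, size limits, offset bounds)
C++ error codes mapped to language-specific exceptions or error types
Memory ownership rules followed (inputs borrowed, outputs owned by the language runtime)
Coverage targets met (line >80%, branch >75%, function >90%)
Unit, integration, property-based, and memory leak tests in place
Static analysis, fuzzing, sanitizers, and CodeQL integrated
API reference, quickstart tutorial, and migration guide written
CI build matrix, release automation, and performance regression tracking configured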

References

ARCHITECTURE.md: Overall system architecture
PROTOCOL_INTEGRATION.md: Adding new protocol adapters
CLAUDE.md: Development guidelines
Issue #26: Safety-critical design patterns
Issue #29: FFI boundary security audit

Document Version: 1.0
Last Updated: 2025-11-06
Maintainers: ASTERIX Contributors
License: GPL-3.0-or-later
