Audio Engine: onboarding a new component¶

This guide is for contributors who want to add a new pipeline component to the Audio Engine (app/audio_engine/). A component is a synchronous handler registered by component_id; the HTTP router builds an ordered list of { "component_id", "params" } and the pipeline runs each handler on a module-owned executor.

For end-user API usage (multipart fields, response shape), see the Audio Engine user guide (mkdocs/docs/audio_engine.md).

Prerequisites¶

Handlers run inside a ThreadPoolExecutor or ProcessPoolExecutor, so they must be plain synchronous functions. Do not perform heavy work or blocking I/O on the FastAPI event loop; the pipeline already schedules handlers with run_in_executor.
Secrets (API keys, regions) come from the environment (or your existing config pattern), never from client params JSON.
New Python dependencies must be added in a way that does not break the Language Server lockfile (for example, an optional dependency group or reviewed pins). Native libraries (e.g. for codecs) must also be installed in the container images if you deploy with containers.
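As a rough illustration of the scheduling described above, the sketch below runs a synchronous handler through run_in_executor on a thread pool, so the blocking call never executes on the event loop thread. The names slow_handler, run_step, and demo, and the audio-engine thread prefix, are assumptions made for this example; the actual scheduling code lives in app/audio_engine/pipeline.py and may differ.

```python
import asyncio
import threading
import time
from concurrent.futures import Executor, ThreadPoolExecutor
from functools import partial
from typing import Any, Callable


def slow_handler(audio_path: str, params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical handler: plain synchronous code that is allowed to block."""
    time.sleep(0.01)  # stands in for blocking I/O or heavy computation
    return {"metrics": {"worker_thread": threading.current_thread().name}}


async def run_step(
    executor: Executor,
    handler: Callable[..., dict[str, Any]],
    audio_path: str,
    params: dict[str, Any],
) -> dict[str, Any]:
    """Await a synchronous handler without blocking the event loop."""
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, hence the partial.
    return await loop.run_in_executor(executor, partial(handler, audio_path, params))


def demo() -> dict[str, Any]:
    """Run one step on a small thread pool (illustrative pool settings)."""
    with ThreadPoolExecutor(max_workers=2, thread_name_prefix="audio-engine") as pool:
        return asyncio.run(run_step(pool, slow_handler, "/tmp/in.wav", {}))
```

Calling demo() returns a metrics dict whose worker_thread value names a pool thread rather than the main thread, which is the property the prerequisite is protecting.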

1. Declare the canonical `component_id`¶

File: app/audio_engine/constants.py

Add a module-level constant for the string id (e.g. MY_COMPONENT = "my_component").
Add that same string to the ALL_COMPONENT_IDS frozenset.

The registry rejects registration for any id not present in ALL_COMPONENT_IDS, and the HTTP layer rejects unknown ids before execution.

Naming

Use lowercase snake_case ids to match existing components (noise_reduce, beg_silence_trimmer, …).
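A minimal sketch of what the additions to constants.py could look like, together with a naming check. Only the two existing ids quoted above are shown (the real frozenset contains more); MY_COMPONENT is a placeholder id, and badly_named_ids is a hypothetical helper that is not part of the codebase.

```python
import re

# Existing ids mentioned in this guide; the real module defines more.
NOISE_REDUCE = "noise_reduce"
BEG_SILENCE_TRIMMER = "beg_silence_trimmer"

# New constant for the component being onboarded (placeholder name).
MY_COMPONENT = "my_component"

ALL_COMPONENT_IDS = frozenset({NOISE_REDUCE, BEG_SILENCE_TRIMMER, MY_COMPONENT})

# Lowercase snake_case: starts with a letter, words separated by single underscores.
_SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def badly_named_ids(ids=ALL_COMPONENT_IDS) -> list[str]:
    """Hypothetical helper: return the ids that are not lowercase snake_case."""
    return sorted(i for i in ids if not _SNAKE_CASE.fullmatch(i))
```

A check like this could sit in a unit test so that a misnamed id is caught in review rather than at registration time.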

2. Implement the handler¶

File: app/audio_engine/components/<your_module>.py (one module per component is a good default)

Handler signature¶

Match the contract documented in app/audio_engine/registry.py:

from typing import Any

def your_handler(
    audio_path: str,
    params: dict[str, Any],
    *,
    extra_files: list[str] | None = None,
) -> dict[str, Any]:
    ...

Argument	Meaning
`audio_path`	Path to the current working audio file on disk for this pipeline index (after any previous audio-producing components).
`params`	The `params` object from the client for this index only (defaults to `{}`). Treat keys as untrusted; validate and normalize.
`extra_files`	Reserved for multi-file components. Today the pipeline invokes the handler with positional `(audio_path, params)` only — see Multi-file components if you need additional paths.

Return value (what the pipeline reads)¶

The pipeline inspects the returned dict and builds each entry in result["components"]:

You return	Pipeline behaviour
`output_path` (str, path to a new file on disk)	Audio-producing: working audio for the next component becomes this file; the HTTP `components[]` entry gets `output_url` from your `output_url` key if set, otherwise from `output_path`. You are responsible for writing the file; the pipeline keeps track of the temp outputs it adopts as the working audio.
`metrics` (dict)	Metrics-only: working `audio_path` is unchanged; the response entry includes a `metrics` object. You may return both `output_path` and `metrics` if a step both transforms audio and attaches metrics.

On any uncaught exception, the request fails fast (typically HTTP 500 after logging).

Register the handler¶

At import time (bottom of the module is fine):

from app.audio_engine.registry import register

register("your_component_id", your_handler, pool_kind="thread")

`pool_kind`	When to use
`thread` (default)	I/O-bound work, NumPy/C extensions that release the GIL, most SDK calls.
`process`	Pure-Python CPU-bound work where the GIL hurts; uses the module `ProcessPoolExecutor`, so the handler, its arguments, and its return value must be picklable.
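For orientation, the sketch below shows one way a registry with these rules could be structured. It is not the code in app/audio_engine/registry.py: the storage layout, the error type, and the return shape of get are assumptions, and ALL_COMPONENT_IDS is a stand-in for the set in constants.py.

```python
from typing import Any, Callable

Handler = Callable[..., dict[str, Any]]

# Stand-in for app/audio_engine/constants.py (illustrative subset).
ALL_COMPONENT_IDS = frozenset({"noise_reduce", "my_component"})

_POOL_KINDS = ("thread", "process")
_REGISTRY: dict[str, tuple[Handler, str]] = {}


def register(component_id: str, handler: Handler, pool_kind: str = "thread") -> None:
    """Record a handler; reject ids that were never declared as constants."""
    if component_id not in ALL_COMPONENT_IDS:
        raise ValueError(f"unknown component_id: {component_id!r}")
    if pool_kind not in _POOL_KINDS:
        raise ValueError(f"pool_kind must be one of {_POOL_KINDS}, got {pool_kind!r}")
    _REGISTRY[component_id] = (handler, pool_kind)


def get(component_id: str) -> tuple[Handler, str]:
    """Return the handler and the pool it should run on (assumed return shape)."""
    return _REGISTRY[component_id]


def registered_ids() -> frozenset[str]:
    """Ids that currently have a handler attached."""
    return frozenset(_REGISTRY)
```

With a layout like this, the pipeline can look up the pool kind per step and pick the matching executor before calling run_in_executor.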

3. Ensure the module is imported (registration side effect)¶

File: app/audio_engine/components/__init__.py

Add an import so the package load runs your module’s register(...):

from app.audio_engine.components import your_module  # noqa: F401

lifespan.py already imports app.audio_engine.components when the Audio Engine starts, so once the __init__.py import exists, no change to lifespan.py is usually required.

4. HTTP validation (optional)¶

File: app/audio_engine/router.py

If the component needs request-level rules (extra files, param shape, placement in the pipeline), add validation before run_pipeline, alongside existing checks (e.g. speech_similarity, slice_audio).

Keep validation fast and non-blocking; do not open large files on the event loop beyond what the router already does for uploads.
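A request-level check of this kind might look like the sketch below. Everything in it is hypothetical: validate_steps, PipelineRequestError, and the rule that speech_similarity needs at least one extra file are assumptions chosen to illustrate the pattern, not the checks that router.py actually performs. The function only inspects already-parsed data, so it stays cheap enough to run on the event loop.

```python
from typing import Any

# Stand-ins for constants.py and for a component-specific rule (assumed).
ALL_COMPONENT_IDS = frozenset({"noise_reduce", "speech_similarity"})
NEEDS_EXTRA_FILES = frozenset({"speech_similarity"})


class PipelineRequestError(ValueError):
    """Hypothetical error that the router would translate into a 4xx response."""


def validate_steps(steps: list[dict[str, Any]], extra_file_count: int) -> None:
    """Reject a malformed pipeline before any handler is scheduled."""
    if not steps:
        raise PipelineRequestError("at least one component is required")
    for index, step in enumerate(steps):
        component_id = step.get("component_id")
        if component_id not in ALL_COMPONENT_IDS:
            raise PipelineRequestError(f"step {index}: unknown component_id {component_id!r}")
        if not isinstance(step.get("params", {}), dict):
            raise PipelineRequestError(f"step {index}: params must be an object")
        if component_id in NEEDS_EXTRA_FILES and extra_file_count < 1:
            raise PipelineRequestError(f"step {index}: {component_id} needs an extra file")
```

Including the step index in each message makes it easier for API clients to find the offending entry in a long pipeline.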

Per-component `params` (Pydantic)¶

File: app/audio_engine/payload_validator.py

For components that accept JSON params, define a small BaseModel with model_config = ConfigDict(extra="forbid"), register it in _COMPONENT_PARAM_MODELS keyed by the constant from constants.py, and add unit tests for invalid keys / out-of-range values. The router validates each step’s params before the pipeline runs; handlers receive the validated dict (see noise_reduce, audio_duration_check, speech_duration_measurement, slice_audio for examples).
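The pattern described above can be sketched as follows, assuming Pydantic v2 (which ConfigDict implies). MyComponentParams, its threshold_db field and bounds, and the validate_params helper are invented for illustration; the real models and the lookup logic in payload_validator.py may be organised differently.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, ValidationError

MY_COMPONENT = "my_component"  # would be imported from constants.py


class MyComponentParams(BaseModel):
    """Hypothetical params schema: unknown keys are rejected outright."""

    model_config = ConfigDict(extra="forbid")

    threshold_db: float = Field(default=-40.0, ge=-120.0, le=0.0)


_COMPONENT_PARAM_MODELS: dict[str, type[BaseModel]] = {
    MY_COMPONENT: MyComponentParams,
}


def validate_params(component_id: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Return a validated dict with defaults filled in (illustrative helper)."""
    model = _COMPONENT_PARAM_MODELS.get(component_id)
    if model is None:
        return dict(raw)  # no schema registered for this component
    return model.model_validate(raw).model_dump()
```

With extra="forbid", a misspelled key such as treshold_db raises ValidationError instead of being silently ignored, which is the failure mode the unit tests for invalid keys should cover.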

5. Tests¶

Add focused tests under tests/unit/app/audio_engine/:

Test type	Purpose
Handler unit tests	Mock filesystem / SDKs; assert return dict shape (`output_path`, `metrics`, errors on bad input).
Registry	After importing your module, `component_id` appears in `registered_ids()` (if you need a smoke test).
Router	Optional: multipart request with your `component_id` and mocked `run_pipeline` or patched dependencies.

Run:

uv run pytest tests/unit/app/audio_engine/ -vv
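A handler unit test of the kind listed in the table might look like the sketch below. To stay self-contained it defines its own tiny metrics-only handler (duration_handler) and a fixture helper (write_silence); both are hypothetical, and in a real test you would import the handler from your component module instead. The tmp_path argument is pytest's built-in temporary directory fixture.

```python
import wave
from pathlib import Path
from typing import Any


def duration_handler(audio_path: str, params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical metrics-only handler under test."""
    with wave.open(audio_path, "rb") as src:
        return {"metrics": {"duration_seconds": src.getnframes() / src.getframerate()}}


def write_silence(path: Path, seconds: float, rate: int = 8000) -> None:
    """Write a mono 16-bit WAV file of silence to use as test input."""
    with wave.open(str(path), "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(b"\x00\x00" * int(seconds * rate))


def test_metrics_only_shape(tmp_path: Path) -> None:
    wav = tmp_path / "in.wav"
    write_silence(wav, 0.25)
    result = duration_handler(str(wav), {})
    # Metrics-only handlers must not return output_path.
    assert set(result) == {"metrics"}
    assert result["metrics"]["duration_seconds"] == 0.25


def test_missing_file_raises(tmp_path: Path) -> None:
    try:
        duration_handler(str(tmp_path / "missing.wav"), {})
    except FileNotFoundError:
        pass
    else:
        raise AssertionError("expected FileNotFoundError")
```

In a pytest suite the second test would normally use pytest.raises; the try/except form is used here only so the sketch has no third-party imports.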

6. Documentation¶

User guide: update mkdocs/docs/audio_engine.md — component catalog table (component_id, params, output, implemented = Yes).

This guide: if you introduce a new cross-cutting rule (e.g. env vars, pool choice), add a short subsection here.

Multi-file components¶

The router saves uploads to temp paths: files[0] is the working audio; files[1:] are available as extra_file_paths inside run_pipeline but are not passed into handlers yet. If your component needs them (e.g. speech_similarity):

Extend run_pipeline in app/audio_engine/pipeline.py to call the handler with extra paths (e.g. functools.partial(handler, extra_files=extra_file_paths) or a small wrapper), or
Bind at registration time only if acceptable for your design.

Coordinate with reviewers so all components keep a consistent calling convention.

Checklist¶

Constant + ALL_COMPONENT_IDS in constants.py
Handler module under app/audio_engine/components/, synchronous, validates params
register(...) with correct pool_kind
Import in components/__init__.py
Router validation if the API contract needs special rules
payload_validator.py model + _COMPONENT_PARAM_MODELS entry if the step accepts non-empty params
Unit tests (+ uv run ruff check / uv run ruff format on touched files)
audio_engine.md catalog updated

Reference implementation¶

Topic	Example in codebase
Audio-producing, temp output file	`app/audio_engine/components/noise_reduce.py`
Metrics-only handler (`metrics` in result, no `output_path`)	`app/audio_engine/components/audio_duration_check.py`, `app/audio_engine/components/speech_duration_measurement.py`
Metrics-only tests (mocked heavy deps)	`tests/unit/app/audio_engine/test_speech_duration_measurement.py`
Registry API	`app/audio_engine/registry.py`
Pipeline merge of return dict	`app/audio_engine/pipeline.py`
Typed `params` with `extra="forbid"`	`app/audio_engine/payload_validator.py` (`SpeechDurationMeasurementParams`, `NoiseReduceParams`, …)

Path	Role
`app/audio_engine/constants.py`	Canonical ids
`app/audio_engine/registry.py`	`register` / `get`
`app/audio_engine/pipeline.py`	Sequential execution, result envelope
`app/audio_engine/router.py`	Multipart API, admission control, validation
`app/audio_engine/payload_validator.py`	Common form fields + per-component `params` schemas
`app/audio_engine/config.py`	Limits and feature flag
`app/audio_engine/lifespan.py`	Executor pools and component package import