Audio Engine: onboarding a new component

This guide is for contributors who want to add a new pipeline component to the Audio Engine (app/audio_engine/). A component is a synchronous handler registered under a component_id. The HTTP router builds an ordered list of steps, each with a component_id and a params object, and the pipeline runs each handler on a module-owned executor.

For end-user API usage (multipart fields, response shape), see Audio Engine.


Prerequisites

  • Handlers run inside a ThreadPoolExecutor or ProcessPoolExecutor, so they must be plain synchronous functions. Do not perform heavy work or blocking I/O on the FastAPI event loop; the pipeline already schedules handlers with run_in_executor.
  • Secrets (API keys, regions) come from the environment (or your existing config pattern), never from client params JSON.
  • New Python dependencies must be added in a way that does not break the Language Server lockfile (optional dependency group / reviewed pins). Native libraries (e.g. for codecs) must be reflected in container images if you deploy that way.

1. Declare the canonical component_id

File: app/audio_engine/constants.py

  1. Add a module-level constant for the string id (e.g. MY_COMPONENT = "my_component").
  2. Add that same string to the ALL_COMPONENT_IDS frozenset.

The registry rejects registration for any id not present in ALL_COMPONENT_IDS, and the HTTP layer rejects unknown ids before execution.

Naming

Use lowercase snake_case ids to match existing components (noise_reduce, beg_silence_trimmer, …).
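
The two edits to constants.py might look like the sketch below. The existing ids are taken from the examples in this guide; the assert_known helper is hypothetical and only illustrates the kind of membership check the registry and HTTP layer are described as performing.

```python
# Sketch in the style of app/audio_engine/constants.py (not the real file).
NOISE_REDUCE = "noise_reduce"
BEG_SILENCE_TRIMMER = "beg_silence_trimmer"
MY_COMPONENT = "my_component"  # step 1: the new module-level constant

# Step 2: the same string must also be a member of the frozenset.
ALL_COMPONENT_IDS: frozenset[str] = frozenset(
    {NOISE_REDUCE, BEG_SILENCE_TRIMMER, MY_COMPONENT}
)


def assert_known(component_id: str) -> None:
    """Hypothetical check; the real registry and router code may differ."""
    if component_id not in ALL_COMPONENT_IDS:
        raise ValueError(f"unknown component_id: {component_id!r}")
```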


2. Implement the handler

File: app/audio_engine/components/<your_module>.py (one module per component is a good default)

Handler signature

Match the contract documented in app/audio_engine/registry.py:

def your_handler(
    audio_path: str,
    params: dict[str, Any],
    *,
    extra_files: list[str] | None = None,
) -> dict[str, Any]:
    ...

| Argument | Meaning |
| --- | --- |
| audio_path | Path to the current working audio file on disk for this pipeline index (after any previous audio-producing components). |
| params | The params object from the client for this index only (defaults to {}). Treat keys as untrusted; validate and normalize. |
| extra_files | Reserved for multi-file components. Today the pipeline invokes the handler with positional (audio_path, params) only; see Multi-file components if you need additional paths. |

Return value (what the pipeline reads)

The pipeline inspects the returned dict and builds each entry in result["components"]:

| You return | Pipeline behaviour |
| --- | --- |
| output_path (str, path to a new file on disk) | Audio-producing: working audio for the next component becomes this file; the HTTP components[] entry gets output_url from your output_url key if set, otherwise from output_path. You are responsible for writing the file; the pipeline tracks temp outputs it switches to as the working buffer. |
| metrics (dict) | Metrics-only: working audio_path is unchanged; the response entry includes a metrics object. You may return both output_path and metrics if a step both transforms audio and attaches metrics. |

On any uncaught exception, the request fails fast (typically HTTP 500 after logging).

Register the handler

At import time (bottom of the module is fine):

from app.audio_engine.registry import register

register("your_component_id", your_handler, pool_kind="thread")

| pool_kind | When to use |
| --- | --- |
| thread (default) | I/O-bound work, NumPy/C extensions that release the GIL, most SDK calls. |
| process | Pure-Python CPU-bound work where the GIL hurts; uses the module ProcessPoolExecutor (picklable handler / constraints apply). |

3. Ensure the module is imported (registration side effect)

File: app/audio_engine/components/__init__.py

Add an import so the package load runs your module’s register(...):

from app.audio_engine.components import your_module  # noqa: F401

lifespan.py already imports app.audio_engine.components when Audio Engine starts, so once the __init__.py import exists, no change to lifespan.py is usually required.
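
Registration is an import-time side effect, so a module that is never imported never registers. The toy registry below makes that visible; it is a hypothetical stand-in, not the real registry.py, and its internals (the _REGISTRY dict, the error messages) are assumptions.

```python
from typing import Any, Callable

Handler = Callable[..., dict[str, Any]]

ALL_COMPONENT_IDS = frozenset({"my_component"})  # stand-in for constants.py
_REGISTRY: dict[str, tuple[Handler, str]] = {}


def register(component_id: str, handler: Handler, pool_kind: str = "thread") -> None:
    """Store the handler; reject ids that constants.py does not declare."""
    if component_id not in ALL_COMPONENT_IDS:
        raise ValueError(f"unknown component_id: {component_id!r}")
    if pool_kind not in ("thread", "process"):
        raise ValueError(f"unsupported pool_kind: {pool_kind!r}")
    _REGISTRY[component_id] = (handler, pool_kind)


def registered_ids() -> frozenset[str]:
    """Ids whose modules have actually been imported and run register(...)."""
    return frozenset(_REGISTRY)


def my_handler(audio_path: str, params: dict[str, Any], *, extra_files=None) -> dict[str, Any]:
    return {"metrics": {}}


# In a real component module this call sits at the bottom of the file and
# runs only when something imports the module.
register("my_component", my_handler)
```

A smoke test that imports the components package and asserts the new id appears in registered_ids() catches a forgotten __init__.py import early.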


4. HTTP validation (optional)

File: app/audio_engine/router.py

If the component needs request-level rules (extra files, param shape, placement in the pipeline), add validation before run_pipeline, alongside existing checks (e.g. speech_similarity, slice_audio).

Keep validation fast and non-blocking; do not open large files on the event loop beyond what the router already does for uploads.
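
A request-level check of this kind can stay cheap by inspecting only the step list and the number of uploaded files. The sketch below is hypothetical: validate_steps, the id sets, and the rule that speech_similarity needs at least one extra file are assumptions made for illustration, not the router's actual rules.

```python
from typing import Any

KNOWN_IDS = frozenset({"noise_reduce", "speech_similarity"})
NEEDS_EXTRA_FILES = frozenset({"speech_similarity"})  # assumed rule


def validate_steps(steps: list[dict[str, Any]], n_files: int) -> list[str]:
    """Return human-readable problems; an empty list means the request passes."""
    problems: list[str] = []
    for index, step in enumerate(steps):
        component_id = step.get("component_id")
        if not isinstance(component_id, str) or component_id not in KNOWN_IDS:
            problems.append(f"step {index}: unknown component_id {component_id!r}")
            continue
        if not isinstance(step.get("params", {}), dict):
            problems.append(f"step {index}: params must be an object")
        if component_id in NEEDS_EXTRA_FILES and n_files < 2:
            problems.append(f"step {index}: {component_id} needs an extra file")
    return problems
```

The function touches no files and performs no I/O, so it is safe to call directly on the event loop before run_pipeline.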

Per-component params (Pydantic)

File: app/audio_engine/payload_validator.py

For components that accept JSON params, define a small BaseModel with model_config = ConfigDict(extra="forbid"), register it in _COMPONENT_PARAM_MODELS keyed by the constant from constants.py, and add unit tests for invalid keys / out-of-range values. The router validates each step’s params before the pipeline runs; handlers receive the validated dict (see noise_reduce, audio_duration_check, speech_duration_measurement, slice_audio for examples).
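
A params model along these lines might look like the sketch below, assuming Pydantic v2. MyComponentParams, its threshold_db field, and the validate_params helper are hypothetical; only the extra="forbid" configuration and the _COMPONENT_PARAM_MODELS name come from this guide.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class MyComponentParams(BaseModel):
    """Hypothetical params schema; unknown keys are rejected."""

    model_config = ConfigDict(extra="forbid")

    threshold_db: float = Field(default=-40.0, ge=-120.0, le=0.0)


# Keyed by the component_id string (the constant from constants.py in real code).
_COMPONENT_PARAM_MODELS: dict[str, type[BaseModel]] = {
    "my_component": MyComponentParams,
}


def validate_params(component_id: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Validate one step's params and return the normalized dict."""
    model = _COMPONENT_PARAM_MODELS.get(component_id)
    if model is None:
        return dict(raw)  # assumption: components without a model pass through
    return model.model_validate(raw).model_dump()
```

With this sketch, a misspelled key such as treshold_db or an out-of-range value raises ValidationError, which are the two cases the unit tests should cover.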


5. Tests

Add focused tests under tests/unit/app/audio_engine/:

| Test type | Purpose |
| --- | --- |
| Handler unit tests | Mock filesystem / SDKs; assert return dict shape (output_path, metrics, errors on bad input). |
| Registry | After importing your module, component_id appears in registered_ids() (if you need a smoke test). |
| Router | Optional: multipart request with your component_id and mocked run_pipeline or patched dependencies. |

Run:

uv run pytest tests/unit/app/audio_engine/ -vv

6. Documentation

User guide: update mkdocs/docs/audio_engine.md — component catalog table (component_id, params, output, implemented = Yes).

This guide: if you introduce a new cross-cutting rule (e.g. env vars, pool choice), add a short subsection here.


Multi-file components

The router saves uploads to temp paths: files[0] is the working audio; files[1:] are available as extra_file_paths inside run_pipeline but are not passed into handlers yet. If your component needs them (e.g. speech_similarity):

  1. Extend run_pipeline in app/audio_engine/pipeline.py to call the handler with extra paths (e.g. functools.partial(handler, extra_files=extra_file_paths) or a small wrapper), or
  2. Bind at registration time only if acceptable for your design.

Coordinate with reviewers so all components keep a consistent calling convention.


Checklist

  • Constant + ALL_COMPONENT_IDS in constants.py
  • Handler module under app/audio_engine/components/, synchronous, validates params
  • register(...) with correct pool_kind
  • Import in components/__init__.py
  • Router validation if the API contract needs special rules
  • payload_validator.py model + _COMPONENT_PARAM_MODELS entry if the step accepts non-empty params
  • Unit tests (+ uv run ruff check / uv run ruff format on touched files)
  • audio_engine.md catalog updated

Reference implementation

| Topic | Example in codebase |
| --- | --- |
| Audio-producing, temp output file | app/audio_engine/components/noise_reduce.py |
| Metrics-only handler (metrics in result, no output_path) | app/audio_engine/components/audio_duration_check.py, app/audio_engine/components/speech_duration_measurement.py |
| Metrics-only tests (mocked heavy deps) | tests/unit/app/audio_engine/test_speech_duration_measurement.py |
| Registry API | app/audio_engine/registry.py |
| Pipeline merge of return dict | app/audio_engine/pipeline.py |
| Typed params with extra="forbid" | app/audio_engine/payload_validator.py (SpeechDurationMeasurementParams, NoiseReduceParams, …) |

| Path | Role |
| --- | --- |
| app/audio_engine/constants.py | Canonical ids |
| app/audio_engine/registry.py | register / get |
| app/audio_engine/pipeline.py | Sequential execution, result envelope |
| app/audio_engine/router.py | Multipart API, admission control, validation |
| app/audio_engine/payload_validator.py | Common form fields + per-component params schemas |
| app/audio_engine/config.py | Limits and feature flag |
| app/audio_engine/lifespan.py | Executor pools and component package import |