Audio Engine: onboarding a new component
This guide is for contributors who want to add a new pipeline component to the Audio Engine (app/audio_engine/). A component is a synchronous handler registered by component_id; the HTTP router builds an ordered list of { "component_id", "params" } and the pipeline runs each handler on a module-owned executor.
For end-user API usage (multipart fields, response shape), see Audio Engine.
Prerequisites
- Handlers run inside a ThreadPoolExecutor or ProcessPoolExecutor, so they must be plain synchronous functions. Do not perform heavy work or blocking I/O on the FastAPI event loop; the pipeline already schedules handlers with run_in_executor.
- Secrets (API keys, regions) come from the environment (or your existing config pattern), never from client params JSON.
- New Python dependencies must be added in a way that does not break the Language Server lockfile (optional dependency group / reviewed pins). Native libraries (e.g. for codecs) must be reflected in container images if you deploy that way.
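To make the first prerequisite concrete, here is a minimal sketch of how a synchronous handler can be scheduled off the event loop with run_in_executor. The names slow_handler, run_step and main are hypothetical, and the real pipeline's scheduling code may differ; the sketch only shows why a plain function is the right shape for a handler.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any


def slow_handler(audio_path: str, params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical handler: a plain synchronous function that blocks briefly."""
    time.sleep(0.05)  # stands in for blocking I/O or a C-extension call
    return {"metrics": {"path_length": len(audio_path)}}


async def run_step(
    executor: ThreadPoolExecutor, audio_path: str, params: dict[str, Any]
) -> dict[str, Any]:
    """Schedule the synchronous handler on the executor, not on the event loop."""
    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional arguments only.
    return await loop.run_in_executor(executor, slow_handler, audio_path, params)


async def main() -> dict[str, Any]:
    with ThreadPoolExecutor(max_workers=2) as executor:
        return await run_step(executor, "/tmp/in.wav", {})


result = asyncio.run(main())
print(result)
```

Because run_in_executor accepts positional arguments only, any keyword argument has to be bound beforehand, for example with functools.partial.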
1. Declare the canonical component_id
File: app/audio_engine/constants.py
- Add a module-level constant for the string id (e.g. MY_COMPONENT = "my_component").
- Add that same string to the ALL_COMPONENT_IDS frozenset.
The registry rejects registration for any id not present in ALL_COMPONENT_IDS, and the HTTP layer rejects unknown ids before execution.
Naming
Use lowercase snake_case ids to match existing components (noise_reduce, beg_silence_trimmer, …).
2. Implement the handler
File: app/audio_engine/components/<your_module>.py (one module per component is a good default)
Handler signature
Match the contract documented in app/audio_engine/registry.py:
```python
from typing import Any


def your_handler(
    audio_path: str,
    params: dict[str, Any],
    *,
    extra_files: list[str] | None = None,
) -> dict[str, Any]:
    ...
```
| Argument | Meaning |
|---|---|
| audio_path | Path to the current working audio file on disk for this pipeline index (after any previous audio-producing components). |
| params | The params object from the client for this index only (defaults to {}). Treat keys as untrusted; validate and normalize. |
| extra_files | Reserved for multi-file components. Today the pipeline invokes the handler with positional (audio_path, params) only; see Multi-file components if you need additional paths. |
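As an illustration of "validate and normalize", the sketch below checks params inside the handler module. The threshold_db key, its range and its default are invented for the example; a real component would enforce its own schema, and may rely on the Pydantic model described in step 4 instead.

```python
from typing import Any


def normalize_params(params: dict[str, Any]) -> dict[str, Any]:
    """Validate untrusted params for a hypothetical component.

    Assumed schema (illustrative only): an optional numeric `threshold_db`
    in the range [-80, 0], defaulting to -40.
    """
    allowed = {"threshold_db"}
    unknown = set(params) - allowed
    if unknown:
        raise ValueError(f"unexpected params: {sorted(unknown)}")

    threshold = params.get("threshold_db", -40.0)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        raise ValueError("threshold_db must be a number")
    if not -80.0 <= threshold <= 0.0:
        raise ValueError("threshold_db must be between -80 and 0")
    return {"threshold_db": float(threshold)}
```

Returning a fresh dict with known keys and types means the rest of the handler never touches the raw client input.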
Return value (what the pipeline reads)
The pipeline inspects the returned dict and builds each entry in result["components"]:
| You return | Pipeline behaviour |
|---|---|
| output_path (str, path to a new file on disk) | Audio-producing: working audio for the next component becomes this file; the HTTP components[] entry gets output_url from your output_url key if set, otherwise from output_path. You are responsible for writing the file; the pipeline tracks temp outputs it switches to as the working buffer. |
| metrics (dict) | Metrics-only: working audio_path is unchanged; the response entry includes a metrics object. You may return both output_path and metrics if a step both transforms audio and attaches metrics. |
On any uncaught exception, the request fails fast (typically HTTP 500 after logging).
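The two return shapes, and the way the working path moves from step to step, can be sketched with a pair of toy handlers and a simplified loop. run_steps is a hypothetical stand-in for the logic in pipeline.py, not a copy of it, and the final_path key exists only to make the example observable.

```python
import os
import shutil
import tempfile
from typing import Any


def copy_handler(audio_path: str, params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical audio-producing step: writes a new file and reports its size."""
    fd, out_path = tempfile.mkstemp(suffix=os.path.splitext(audio_path)[1])
    os.close(fd)
    shutil.copyfile(audio_path, out_path)  # a real step would transform the audio
    return {"output_path": out_path, "metrics": {"bytes": os.path.getsize(out_path)}}


def size_handler(audio_path: str, params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical metrics-only step: leaves the working audio untouched."""
    return {"metrics": {"bytes": os.path.getsize(audio_path)}}


def run_steps(audio_path: str, steps: list) -> dict[str, Any]:
    """Simplified stand-in for how the pipeline reads each return dict."""
    components = []
    for component_id, handler, params in steps:
        out = handler(audio_path, params)
        entry: dict[str, Any] = {"component_id": component_id}
        if "output_path" in out:
            audio_path = out["output_path"]  # the next step works on the new file
            entry["output_url"] = out.get("output_url", out["output_path"])
        if "metrics" in out:
            entry["metrics"] = out["metrics"]
        components.append(entry)
    return {"components": components, "final_path": audio_path}


with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as src:
    src.write(b"\x00" * 16)  # placeholder bytes, not real audio

result = run_steps(src.name, [("copy", copy_handler, {}), ("size", size_handler, {})])
```

In this sketch the second step measures the file written by the first, which is the behaviour the table describes for a metrics-only step that follows an audio-producing one.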
Register the handler
At import time (bottom of the module is fine):
```python
from app.audio_engine.registry import register

register("your_component_id", your_handler, pool_kind="thread")
```
| pool_kind | When to use |
|---|---|
| thread (default) | I/O-bound work, NumPy/C extensions that release the GIL, most SDK calls. |
| process | Pure-Python CPU-bound work where the GIL hurts; uses the module ProcessPoolExecutor (picklable handler / constraints apply). |
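The "picklable handler" constraint for the process pool can be checked up front. The helper names below are hypothetical and choose_pool_kind is only an illustrative heuristic, not a rule the registry enforces; the point is that a handler which cannot be pickled cannot be sent to a worker process.

```python
import pickle
from typing import Any, Callable


def can_pickle(handler: Callable[..., Any]) -> bool:
    """Return True if `handler` can be pickled, as a process pool requires."""
    try:
        pickle.dumps(handler)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def choose_pool_kind(handler: Callable[..., Any], cpu_bound_pure_python: bool) -> str:
    """Illustrative heuristic: pick "process" only when it is useful and possible."""
    if cpu_bound_pure_python and can_pickle(handler):
        return "process"
    return "thread"


# Lambdas cannot be pickled by reference, so this one has to stay on threads.
inline_handler = lambda audio_path, params: {"metrics": {}}
print(can_pickle(inline_handler))
print(choose_pool_kind(inline_handler, cpu_bound_pure_python=True))
```

Handlers defined with def at module level are pickled by reference to their importable name, which is one reason the one-module-per-component layout works well with pool_kind="process".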
3. Ensure the module is imported (registration side effect)
File: app/audio_engine/components/__init__.py
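Add an import so that loading the package runs your module's register(...) call, for example from . import your_module (your_module is a placeholder for your file name; follow the import style already used in that file).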
Lifespan already does import app.audio_engine.components when Audio Engine starts, so no change to lifespan.py is usually required once the __init__.py import exists.
4. HTTP validation (optional)
File: app/audio_engine/router.py
If the component needs request-level rules (extra files, param shape, placement in the pipeline), add validation before run_pipeline, alongside existing checks (e.g. speech_similarity, slice_audio).
Keep validation fast and non-blocking; do not open large files on the event loop beyond what the router already does for uploads.
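A request-level check of this kind can stay cheap by looking only at the step list and the number of uploaded files. The sketch below is an assumption about shape, not the router's code: the _MIN_FILES table, the value 2 for speech_similarity, and the validate_steps name are all hypothetical, and a FastAPI router would typically translate the ValueError into an HTTP error response.

```python
from typing import Any

# Hypothetical rule table: minimum number of uploaded files per component.
_MIN_FILES = {"speech_similarity": 2}


def validate_steps(steps: list[dict[str, Any]], file_count: int) -> None:
    """Raise ValueError if any step violates a request-level rule."""
    for index, step in enumerate(steps):
        component_id = step.get("component_id")
        if not isinstance(component_id, str):
            raise ValueError(f"step {index}: component_id must be a string")
        needed = _MIN_FILES.get(component_id, 1)
        if file_count < needed:
            raise ValueError(
                f"step {index}: {component_id} needs {needed} files, got {file_count}"
            )
```

Nothing here opens a file or touches the network, so it is safe to run on the event loop before run_pipeline is called.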
Per-component params (Pydantic)
File: app/audio_engine/payload_validator.py
For components that accept JSON params, define a small BaseModel with model_config = ConfigDict(extra="forbid"), register it in _COMPONENT_PARAM_MODELS keyed by the constant from constants.py, and add unit tests for invalid keys / out-of-range values. The router validates each step’s params before the pipeline runs; handlers receive the validated dict (see noise_reduce, audio_duration_check, speech_duration_measurement, slice_audio for examples).
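A params model in this style might look like the sketch below, written against the Pydantic v2 API. MyComponentParams, its strength field and the validate_params helper are hypothetical; only ConfigDict(extra="forbid") and the _COMPONENT_PARAM_MODELS name come from the description above, and the real mapping in payload_validator.py is keyed by the constants rather than by a string literal.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class MyComponentParams(BaseModel):
    """Hypothetical params schema; the field name and bounds are illustrative."""

    model_config = ConfigDict(extra="forbid")  # unknown keys are rejected

    strength: float = Field(default=0.5, ge=0.0, le=1.0)


# Stand-in for the mapping that lives in payload_validator.py.
_COMPONENT_PARAM_MODELS: dict[str, type[BaseModel]] = {
    "my_component": MyComponentParams,
}


def validate_params(component_id: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Validate raw params and return a plain dict for the handler."""
    model = _COMPONENT_PARAM_MODELS.get(component_id)
    if model is None:
        return dict(raw)  # assumption: steps without a model pass through unchanged
    return model.model_validate(raw).model_dump()


try:
    validate_params("my_component", {"strength": 0.3, "typo_key": 1})
except ValidationError as exc:
    error_count = exc.error_count()
```

With extra="forbid", a misspelled key fails validation instead of being silently ignored, which is the behaviour the unit tests for invalid keys should pin down.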
5. Tests
Add focused tests under tests/unit/app/audio_engine/:
| Test type | Purpose |
|---|---|
| Handler unit tests | Mock filesystem / SDKs; assert return dict shape (output_path, metrics, errors on bad input). |
| Registry | After importing your module, component_id appears in registered_ids() (if you need a smoke test). |
| Router | Optional: multipart request with your component_id and mocked run_pipeline or patched dependencies. |
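A handler unit test of the kind listed in the first row could be sketched as follows. duration_handler is a hypothetical metrics-only handler defined inline so the example is self-contained; in the repository the test would import the real handler instead, and heavy dependencies would be mocked.

```python
import os
import tempfile
import wave
from typing import Any


def duration_handler(audio_path: str, params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical metrics-only handler: WAV duration in seconds."""
    with wave.open(audio_path, "rb") as wav:
        seconds = wav.getnframes() / wav.getframerate()
    return {"metrics": {"duration_seconds": seconds}}


def write_silence(path: str, seconds: float, rate: int = 8000) -> None:
    """Write mono 16-bit silence so the test needs no fixture files."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


def test_duration_handler_returns_metrics_only() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "silence.wav")
        write_silence(path, seconds=0.5)
        result = duration_handler(path, {})
    assert set(result) == {"metrics"}  # no output_path for a metrics-only step
    assert result["metrics"]["duration_seconds"] == 0.5
```

Asserting on the exact set of keys catches a handler that starts returning output_path by accident, which would change how the pipeline treats the step.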
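Run the new tests, for example with uv run pytest tests/unit/app/audio_engine/ (this assumes the project runs pytest through uv, as it does for the ruff commands in the checklist).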
6. Documentation
User guide: update mkdocs/docs/audio_engine.md — component catalog table (component_id, params, output, implemented = Yes).
This guide: if you introduce a new cross-cutting rule (e.g. env vars, pool choice), add a short subsection here.
Multi-file components
The router saves uploads to temp paths: files[0] is the working audio; files[1:] are available as extra_file_paths inside run_pipeline but are not passed into handlers yet. If your component needs them (e.g. speech_similarity):
- Extend run_pipeline in app/audio_engine/pipeline.py to call the handler with extra paths (e.g. functools.partial(handler, extra_files=extra_file_paths) or a small wrapper), or
- Bind at registration time only if acceptable for your design.
Coordinate with reviewers so all components keep a consistent calling convention.
Checklist
- Constant + ALL_COMPONENT_IDS in constants.py
- Handler module under app/audio_engine/components/, synchronous, validates params
- register(...) with correct pool_kind
- Import in components/__init__.py
- Router validation if the API contract needs special rules
- payload_validator.py model + _COMPONENT_PARAM_MODELS entry if the step accepts non-empty params
- Unit tests (+ uv run ruff check / uv run ruff format on touched files)
- audio_engine.md catalog updated
Reference implementation
| Topic | Example in codebase |
|---|---|
| Audio-producing, temp output file | app/audio_engine/components/noise_reduce.py |
| Metrics-only handler (metrics in result, no output_path) | app/audio_engine/components/audio_duration_check.py, app/audio_engine/components/speech_duration_measurement.py |
| Metrics-only tests (mocked heavy deps) | tests/unit/app/audio_engine/test_speech_duration_measurement.py |
| Registry API | app/audio_engine/registry.py |
| Pipeline merge of return dict | app/audio_engine/pipeline.py |
| Typed params with extra="forbid" | app/audio_engine/payload_validator.py (SpeechDurationMeasurementParams, NoiseReduceParams, …) |
Related paths
| Path | Role |
|---|---|
| app/audio_engine/constants.py | Canonical ids |
| app/audio_engine/registry.py | register / get |
| app/audio_engine/pipeline.py | Sequential execution, result envelope |
| app/audio_engine/router.py | Multipart API, admission control, validation |
| app/audio_engine/payload_validator.py | Common form fields + per-component params schemas |
| app/audio_engine/config.py | Limits and feature flag |
| app/audio_engine/lifespan.py | Executor pools and component package import |