# AGENTS.md - rust-scribe

## Project Overview

rust-scribe is a high-performance video/audio transcriber with timestamps, built with Rust and Whisper. It extracts audio from video files and transcribes it using the Whisper.cpp library.

## Build Commands

```bash
# Build the project (debug mode)
cargo build

# Build release version
cargo build --release

# Run the application
cargo run --release -- <input_file> --model <model_path> [options]

# Example usage:
cargo run --release -- video.mp4 --model models/ggml-base.bin --language zh
```

### Running Tests

```bash
# Run all tests
cargo test

# Run a single test by name
cargo test <test_name>

# Run tests with output
cargo test -- --nocapture
```

### Linting and Formatting

```bash
# Run clippy for linting
cargo clippy

# Fix clippy suggestions automatically
cargo clippy --fix

# Format code
cargo fmt

# Check formatting
cargo fmt --check
```

## Dependencies

- **ffmpeg-next** (v8.0): FFmpeg bindings for audio extraction and resampling
- **whisper-rs** (v0.12): Rust bindings for Whisper.cpp
- **whisper-rs-sys** (v0.10): Low-level Whisper bindings
- **clap** (v4.5): CLI argument parsing
- **anyhow** (v1.0): Error handling
- **ndarray** (v0.15): Array operations

## Code Style Guidelines

### Formatting

- Use `cargo fmt` for consistent formatting
- 4-space indentation (Rust default)
- Maximum line length: 100 characters (default)

### Imports

- Group imports by crate: std → external → local
- Use absolute paths with `crate::` for internal modules
- Prefer bringing traits into scope when using them

```rust
use std::path::Path;

use anyhow::{Context, Result};
use clap::Parser;
use ffmpeg_next as ffmpeg;
```

### Naming Conventions

- **Variables/functions**: snake_case (e.g., `extract_audio_to_f32`, `audio_data`)
- **Types/Enums**: PascalCase (e.g., `Args`, `WhisperContext`)
- **Constants**: SCREAMING_SNAKE_CASE (e.g., `WHISPER_SAMPLE_RATE`)
- **Files**: snake_case (e.g., `main.rs`)

### Error Handling

- Use `anyhow::Result` for application-level error handling
- Use the `?` operator for propagating errors
- Use the `Context` trait for adding context to errors
- Use `anyhow::bail!` for early returns with errors
- Provide descriptive error messages in Chinese or English

```rust
use std::fs::File;

use anyhow::{Context, Result};

// `Config` is a placeholder for whatever type the function returns.
fn load_config() -> Result<Config> {
    let file = File::open("config.toml")
        .context("Failed to open config file")?;
    // ...
}
```

### Unsafe Code

- Minimize unsafe code; isolate it in small, well-documented functions
- Use `unsafe` blocks only when necessary (e.g., FFI callbacks)
- Document preconditions and invariants

```rust
unsafe extern "C" fn progress_callback(...) {
    // Document what this callback does
    // Keep unsafe block minimal
}
```

### Documentation

- Add doc comments (`///`) for public functions
- Document parameters and return values
- Include usage examples for complex functions

### Performance Considerations

- Use `AtomicU64`/`AtomicBool` for global state in callbacks
- Pre-allocate vectors with `Vec::with_capacity()` when size is known
- Use `saturating_*` operations to prevent overflow
- Reuse objects instead of creating new ones in loops

### Type Annotations

- Prefer explicit types for function signatures
- Use type inference for obvious local variables
- Use primitive types (`u32`, `f64`, etc.) over aliases

### Control Flow

- Use early returns to reduce nesting
- Prefer `?` over `match` for simple error propagation
- Use `if let` for optional values when pattern matching is simple

## Project Structure

```
rust-scribe/
├── src/
│   └── main.rs          # Main application code
├── models/              # Whisper model files
├── Cargo.toml           # Project manifest
└── .cargo/
    └── config.toml      # Cargo configuration
```

## Configuration

### CLI Arguments

- `input_file` (positional): Path to video/audio file
- `--model`: Path to Whisper model file
- `--language`: Target language (optional, auto-detects if not specified)
- `--verbose`: Enable verbose output

### Model Requirements

Place Whisper model files (e.g., `ggml-base.bin`) in the `models/` directory.
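
As a concrete illustration of the guidance under Performance Considerations, here is a minimal, stdlib-only sketch. The names `PROCESSED_SAMPLES`, `CANCEL_REQUESTED`, `on_progress`, `remaining_samples`, and `scale_samples` are hypothetical and do not come from `src/main.rs`; the real callback signature depends on the Whisper bindings in use.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

// Hypothetical global state shared between a progress callback and the main thread.
// Atomics avoid `static mut` and need no locking.
static PROCESSED_SAMPLES: AtomicU64 = AtomicU64::new(0);
static CANCEL_REQUESTED: AtomicBool = AtomicBool::new(false);

/// Records progress reported by a callback.
/// Returns `false` once cancellation has been requested.
fn on_progress(samples_done: u64) -> bool {
    PROCESSED_SAMPLES.store(samples_done, Ordering::Relaxed);
    !CANCEL_REQUESTED.load(Ordering::Relaxed)
}

/// Samples still to process; `saturating_sub` clamps at zero instead of underflowing.
fn remaining_samples(total: u64) -> u64 {
    total.saturating_sub(PROCESSED_SAMPLES.load(Ordering::Relaxed))
}

/// Scales 16-bit PCM samples to `f32`.
/// The output length is known up front, so the buffer is allocated once.
fn scale_samples(input: &[i16]) -> Vec<f32> {
    let mut out = Vec::with_capacity(input.len());
    for &sample in input {
        out.push(f32::from(sample) / 32768.0);
    }
    out
}
```

`Ordering::Relaxed` is enough here because each value is read on its own; a stronger ordering would be needed if other memory had to be synchronized through these flags.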

## Common Tasks

### Adding a New Dependency

Add it to the `[dependencies]` section in `Cargo.toml`:

```toml
package_name = "version"
```

### Adding a New Feature

1. Implement the feature in a new function in `src/main.rs`
2. Add a CLI argument to the `Args` struct if needed
3. Test with sample audio/video files

### Debugging

- Use `eprintln!` for debug output (goes to stderr)
- Use `println!` for progress messages
- Enable the `--verbose` flag for Whisper debug output
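
The stdout/stderr split described under Debugging can be sketched with writer-generic helpers, which keeps the formatting testable without capturing the process streams. `write_progress` and `write_debug` are hypothetical names, and the message formats are assumptions rather than what `src/main.rs` prints.

```rust
use std::io::{self, Write};

/// Writes a progress line to `out` (stdout in the real program).
fn write_progress<W: Write>(out: &mut W, done: u64, total: u64) -> io::Result<()> {
    // Guard against division by zero when the total is not yet known.
    let percent = if total == 0 {
        0
    } else {
        done.saturating_mul(100) / total
    };
    writeln!(out, "progress: {percent}%")
}

/// Writes a debug line to `err` (stderr in the real program).
fn write_debug<W: Write>(err: &mut W, msg: &str) -> io::Result<()> {
    writeln!(err, "[debug] {msg}")
}
```

In the application these would be called as `write_progress(&mut io::stdout(), done, total)` and `write_debug(&mut io::stderr(), "...")`, so redirecting stdout leaves debug output on the terminal.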