AGENTS.md - rust-scribe

Project Overview

rust-scribe is a high-performance video/audio transcriber, written in Rust, that produces timestamped transcripts with Whisper. It extracts the audio track from a video file and transcribes it using the whisper.cpp library.

Build Commands

# Build the project (debug mode)
cargo build

# Build release version
cargo build --release

# Run the application
cargo run --release -- <input_file> --model <model_path> [options]

# Example usage:
cargo run --release -- video.mp4 --model models/ggml-base.bin --language zh

Running Tests

# Run all tests
cargo test

# Run a single test by name
cargo test <test_name>

# Run tests and show their stdout/stderr (disables output capture)
cargo test -- --nocapture

Linting and Formatting

# Run clippy for linting
cargo clippy

# Apply machine-applicable clippy suggestions automatically
cargo clippy --fix

# Format code
cargo fmt

# Check formatting
cargo fmt --check

Dependencies

  • ffmpeg-next (v8.0): FFmpeg bindings for audio extraction and resampling
  • whisper-rs (v0.12): Rust bindings for Whisper.cpp
  • whisper-rs-sys (v0.10): Low-level Whisper bindings
  • clap (v4.5): CLI argument parsing
  • anyhow (v1.0): Error handling
  • ndarray (v0.15): Array operations

Code Style Guidelines

Formatting

  • Use cargo fmt for consistent formatting
  • 4-space indentation (Rust default)
  • Maximum line length: 100 characters (default)

Imports

  • Group imports by crate: std → external → local
  • Use absolute paths with crate:: for internal modules
  • Prefer bringing traits into scope when using them

use std::path::Path;

use anyhow::{Context, Result};
use clap::Parser;
use ffmpeg_next as ffmpeg;

Naming Conventions

  • Variables/functions: snake_case (e.g., extract_audio_to_f32, audio_data)
  • Types/Enums: PascalCase (e.g., Args, WhisperContext)
  • Constants: SCREAMING_SNAKE_CASE (e.g., WHISPER_SAMPLE_RATE)
  • Files: snake_case (e.g., main.rs)
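
The naming rules above can be seen together in a short sketch. WHISPER_SAMPLE_RATE is the constant named in this guide, and 16 kHz is the input rate Whisper models expect; TranscriptSegment, samples_to_millis, and segment_from_samples are hypothetical names invented for illustration and are not taken from src/main.rs.

```rust
/// Sample rate (Hz) that Whisper models expect for input audio.
const WHISPER_SAMPLE_RATE: u32 = 16_000;

/// A transcribed span of audio (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
struct TranscriptSegment {
    start_ms: u64,
    end_ms: u64,
}

/// Converts a sample count at `WHISPER_SAMPLE_RATE` into milliseconds.
fn samples_to_millis(sample_count: u64) -> u64 {
    sample_count * 1_000 / u64::from(WHISPER_SAMPLE_RATE)
}

/// Builds a segment from start and end sample offsets.
fn segment_from_samples(start_sample: u64, end_sample: u64) -> TranscriptSegment {
    TranscriptSegment {
        start_ms: samples_to_millis(start_sample),
        end_ms: samples_to_millis(end_sample),
    }
}
```

The constant is SCREAMING_SNAKE_CASE, the struct is PascalCase, and every function, field, and local is snake_case.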

Error Handling

  • Use anyhow::Result<T> for application-level error handling
  • Use ? operator for propagating errors
  • Use Context trait for adding context to errors
  • Use anyhow::bail! for early returns with errors
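  • Provide descriptive error messages in Chinese or English

use std::fs::File;

use anyhow::{Context, Result};

fn open_config() -> Result<File> {
    // `context` wraps the io::Error with a human-readable message
    let file = File::open("config.toml")
        .context("Failed to open config file")?;
    // ...
    Ok(file)
}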

Unsafe Code

  • Minimize unsafe code; isolate it in small, well-documented functions
  • Use unsafe blocks only when necessary (e.g., FFI callbacks)
  • Document preconditions and invariants

unsafe extern "C" fn progress_callback(...) {
    // Document what this callback does and which invariants it relies on
    // Keep the unsafe block minimal
}
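
As a fuller illustration of these rules, here is a minimal sketch of an FFI-style callback with a documented safety contract and a safe wrapper. The signature (a progress value plus a user_data pointer) is an assumption chosen for the example; it is not the actual whisper.cpp callback type, and report_progress is a hypothetical helper.

```rust
use std::os::raw::{c_int, c_void};
use std::sync::atomic::{AtomicI32, Ordering};

/// Records the latest progress percentage reported by a C library.
///
/// # Safety
///
/// `user_data` must be null or point to an `AtomicI32` that stays alive
/// for the whole duration of the call.
unsafe extern "C" fn progress_callback(progress: c_int, user_data: *mut c_void) {
    // SAFETY: the caller guarantees `user_data` is null or a valid, live
    // `AtomicI32`; `as_ref` maps the null case to `None`.
    let slot = unsafe { (user_data as *const AtomicI32).as_ref() };
    if let Some(slot) = slot {
        slot.store(progress, Ordering::Relaxed);
    }
}

/// Safe wrapper so the rest of the program never handles raw pointers.
fn report_progress(slot: &AtomicI32, progress: c_int) {
    let user_data = slot as *const AtomicI32 as *mut c_void;
    // SAFETY: `user_data` is derived from a live reference that outlives
    // the call, and the callback only performs an atomic store through it.
    unsafe { progress_callback(progress, user_data) }
}
```

The unsafe surface is limited to one pointer dereference and one call, each with a SAFETY comment that states why the precondition holds.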

Documentation

  • Add doc comments (///) for public functions
  • Document parameters and return values
  • Include usage examples for complex functions
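
A doc comment that follows these three points might look like the sketch below. format_timestamp is a hypothetical helper written for this example, and the HH:MM:SS.mmm layout is an assumption, not necessarily the format rust-scribe emits.

```rust
/// Formats a millisecond offset as an `HH:MM:SS.mmm` timestamp.
///
/// # Parameters
///
/// * `total_ms` - Offset from the start of the audio, in milliseconds.
///
/// # Returns
///
/// A zero-padded timestamp string. Hours are not wrapped, so offsets of
/// 100 hours or more produce more than two hour digits.
///
/// # Examples
///
///     let stamp = format_timestamp(3_723_004);
///     assert_eq!(stamp, "01:02:03.004");
pub fn format_timestamp(total_ms: u64) -> String {
    let millis = total_ms % 1_000;
    let total_secs = total_ms / 1_000;
    let secs = total_secs % 60;
    let mins = (total_secs / 60) % 60;
    let hours = total_secs / 3_600;
    format!("{hours:02}:{mins:02}:{secs:02}.{millis:03}")
}
```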

Performance Considerations

  • Use AtomicU64/AtomicBool for global state in callbacks
  • Pre-allocate vectors with Vec::with_capacity() when size is known
  • Use saturating_* operations so arithmetic clamps at the type's bounds instead of overflowing
  • Reuse objects instead of creating new ones in loops
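
The sketch below combines three of these points using only the standard library: an atomic counter as lock-free global state, a pre-sized vector, and saturating arithmetic. PROCESSED_SAMPLES, CANCEL_REQUESTED, pcm_i16_to_f32, remaining_samples, and is_cancelled are hypothetical names for illustration, not identifiers from src/main.rs.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

/// Global state that a callback can read or update without locking.
static PROCESSED_SAMPLES: AtomicU64 = AtomicU64::new(0);
static CANCEL_REQUESTED: AtomicBool = AtomicBool::new(false);

/// Converts 16-bit PCM samples to `f32` values in the range [-1.0, 1.0).
fn pcm_i16_to_f32(input: &[i16]) -> Vec<f32> {
    // The output length is known up front, so allocate exactly once.
    let mut output = Vec::with_capacity(input.len());
    for &sample in input {
        output.push(f32::from(sample) / 32_768.0);
    }
    PROCESSED_SAMPLES.fetch_add(input.len() as u64, Ordering::Relaxed);
    output
}

/// Samples still to process; clamps to 0 instead of underflowing.
fn remaining_samples(total: u64) -> u64 {
    total.saturating_sub(PROCESSED_SAMPLES.load(Ordering::Relaxed))
}

/// Returns true once cancellation has been requested.
fn is_cancelled() -> bool {
    CANCEL_REQUESTED.load(Ordering::Relaxed)
}
```

Relaxed ordering is enough here because each value is an independent counter or flag; stronger ordering would be needed if other memory had to be published along with it.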

Type Annotations

  • Prefer explicit types for function signatures
  • Use type inference for obvious local variables
  • Use primitive types (u32, f64, etc.) over aliases

Control Flow

  • Use early returns to reduce nesting
  • Prefer ? over match for simple error propagation
  • Use if let for optional values when pattern matching is simple
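
These three habits are illustrated in the following sketch, which uses only the standard library. parse_seconds, start_offset, and clip_len are hypothetical helpers, and the idea of an optional start offset is an assumption made for the example rather than an existing rust-scribe option.

```rust
use std::num::ParseIntError;

/// Parses a number of seconds, propagating any parse error with `?`.
fn parse_seconds(text: &str) -> Result<u64, ParseIntError> {
    let secs = text.trim().parse::<u64>()?;
    Ok(secs)
}

/// Returns the start offset in seconds, or 0 when none was supplied.
fn start_offset(arg: Option<&str>) -> Result<u64, ParseIntError> {
    // `if let` handles the one interesting variant; `None` falls through.
    if let Some(text) = arg {
        return parse_seconds(text);
    }
    Ok(0)
}

/// Length of the clip from `start` to the end of the media, in seconds.
fn clip_len(start: u64, duration: u64) -> u64 {
    // Early return for the degenerate case keeps the main path flat.
    if start >= duration {
        return 0;
    }
    duration - start
}
```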

Project Structure

rust-scribe/
├── src/
│   └── main.rs           # Main application code
├── models/               # Whisper model files
├── Cargo.toml            # Project manifest
└── .cargo/
    └── config.toml       # Cargo configuration

Configuration

CLI Arguments

  • input_file (positional): Path to video/audio file
  • --model: Path to Whisper model file
  • --language: Language of the audio, e.g., zh (optional; auto-detected if not specified)
  • --verbose: Enable verbose output

Model Requirements

Place Whisper model files (e.g., ggml-base.bin) in the models/ directory.

Common Tasks

Adding a New Dependency

Add the crate to the [dependencies] section of Cargo.toml:

package_name = "version"

Adding a New Feature

  1. Implement the feature in a new function in src/main.rs
  2. Add a CLI argument to the Args struct if needed
  3. Test with sample audio/video files

Debugging

  • Use eprintln! for debug output (goes to stderr)
  • Use println! for progress messages
  • Pass the --verbose flag to enable Whisper debug output