AGENTS.md - rust-scribe
Project Overview
rust-scribe is a high-performance video/audio transcriber, written in Rust, that produces timestamped transcripts with Whisper. It extracts audio from video files and transcribes it using the Whisper.cpp library.
Build Commands
# Build the project (debug mode)
cargo build
# Build release version
cargo build --release
# Run the application
cargo run --release -- <input_file> --model <model_path> [options]
# Example usage:
cargo run --release -- video.mp4 --model models/ggml-base.bin --language zh
Running Tests
# Run all tests
cargo test
# Run a single test by name
cargo test <test_name>
# Run tests with output
cargo test -- --nocapture
Linting and Formatting
# Run clippy for linting
cargo clippy
# Fix clippy suggestions automatically
cargo clippy --fix
# Format code
cargo fmt
# Check formatting
cargo fmt --check
Dependencies
- ffmpeg-next (v8.0): FFmpeg bindings for audio extraction and resampling
- whisper-rs (v0.12): Rust bindings for Whisper.cpp
- whisper-rs-sys (v0.10): Low-level Whisper bindings
- clap (v4.5): CLI argument parsing
- anyhow (v1.0): Error handling
- ndarray (v0.15): Array operations
Code Style Guidelines
Formatting
- Use cargo fmt for consistent formatting
- 4-space indentation (Rust default)
- Maximum line length: 100 characters (default)
Imports
- Group imports by crate: std → external → local
- Use absolute paths with crate:: for internal modules
- Prefer bringing traits into scope when using them
use std::path::Path;
use anyhow::{Context, Result};
use clap::Parser;
use ffmpeg_next as ffmpeg;
Naming Conventions
- Variables/functions: snake_case (e.g., extract_audio_to_f32, audio_data)
- Types/Enums: PascalCase (e.g., Args, WhisperContext)
- Constants: SCREAMING_SNAKE_CASE (e.g., WHISPER_SAMPLE_RATE)
- Files: snake_case (e.g., main.rs)
Error Handling
- Use anyhow::Result<T> for application-level error handling
- Use the ? operator for propagating errors
- Use the Context trait for adding context to errors
- Use anyhow::bail! for early returns with errors
- Provide descriptive error messages in Chinese or English
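use std::fs::File;
use anyhow::{Context, Result};

fn load_config() -> Result<Config> {
    // Attach context so the error names the operation that failed
    let file = File::open("config.toml")
        .context("Failed to open config file")?;
    // ...
}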
Unsafe Code
- Minimize unsafe code; isolate it in small, well-documented functions
- Use unsafe blocks only when necessary (e.g., FFI callbacks)
- Document preconditions and invariants
unsafe extern "C" fn progress_callback(...) {
    // Document what this callback does
    // Keep unsafe block minimal
}
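As a rough sketch of the points above, an FFI callback can confine its unsafe surface to one documented dereference. The signature, the user_data convention, and the report_progress wrapper are assumptions made for illustration; they are not the callback type that whisper-rs or Whisper.cpp actually expects.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicI32, Ordering};

/// Hypothetical progress callback (illustrative signature only).
///
/// # Safety
/// `user_data` must be null or point to an `AtomicI32` that stays alive
/// for the duration of the call.
unsafe extern "C" fn progress_callback(progress: i32, user_data: *mut c_void) {
    // A null pointer means the caller did not register any state.
    if user_data.is_null() {
        return;
    }
    // SAFETY: the caller guarantees the pointer targets a live AtomicI32.
    let slot = unsafe { &*(user_data as *const AtomicI32) };
    // Clamp to a percentage so a bad value cannot leak into the UI.
    slot.store(progress.clamp(0, 100), Ordering::Relaxed);
}

/// Safe wrapper (hypothetical) that upholds the callback's precondition.
fn report_progress(slot: &AtomicI32, progress: i32) {
    // SAFETY: the pointer is derived from a live reference.
    unsafe { progress_callback(progress, slot as *const AtomicI32 as *mut c_void) };
}
```

Keeping the raw-pointer cast behind a safe wrapper means the rest of the code never has to reason about the precondition.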
Documentation
- Add doc comments (///) for public functions
- Document parameters and return values
- Include usage examples for complex functions
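One way these documentation points could look in practice is sketched below. The format_timestamp helper is hypothetical and is not taken from src/main.rs; it only illustrates the doc-comment layout.

```rust
/// Formats a millisecond offset as `HH:MM:SS.mmm`.
///
/// # Arguments
/// * `millis` - Offset from the start of the media, in milliseconds.
///
/// # Returns
/// A zero-padded timestamp string.
///
/// # Examples
///
///     assert_eq!(format_timestamp(3_723_004), "01:02:03.004");
pub fn format_timestamp(millis: u64) -> String {
    let hours = millis / 3_600_000;
    let minutes = (millis / 60_000) % 60;
    let seconds = (millis / 1_000) % 60;
    let ms = millis % 1_000;
    format!("{hours:02}:{minutes:02}:{seconds:02}.{ms:03}")
}
```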
Performance Considerations
- Use AtomicU64/AtomicBool for global state in callbacks
- Pre-allocate vectors with Vec::with_capacity() when size is known
- Use saturating_* operations to prevent overflow
- Reuse objects instead of creating new ones in loops
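The sketch below shows these four points together, using only the standard library. All names (SAMPLES_PROCESSED, pcm_i16_to_f32, remaining_ms, fill_window) are hypothetical and chosen for illustration; the project's own code may be organized differently.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Global counter that a callback could update without a lock (hypothetical name).
static SAMPLES_PROCESSED: AtomicU64 = AtomicU64::new(0);

/// Converts 16-bit PCM samples to f32 in the range [-1.0, 1.0).
fn pcm_i16_to_f32(samples: &[i16]) -> Vec<f32> {
    // The output length is known up front, so allocate exactly once.
    let mut out = Vec::with_capacity(samples.len());
    for &sample in samples {
        out.push(f32::from(sample) / 32768.0);
    }
    SAMPLES_PROCESSED.fetch_add(samples.len() as u64, Ordering::Relaxed);
    out
}

/// Remaining time in milliseconds; never underflows if elapsed exceeds total.
fn remaining_ms(total_ms: u64, elapsed_ms: u64) -> u64 {
    total_ms.saturating_sub(elapsed_ms)
}

/// Refills a caller-owned buffer so a loop can reuse one allocation.
fn fill_window(buffer: &mut Vec<f32>, window: &[f32]) {
    buffer.clear(); // keeps the existing capacity
    buffer.extend_from_slice(window);
}
```

Relaxed ordering is enough here because the counter is only a statistic; state that guards other data would need a stronger ordering.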
Type Annotations
- Prefer explicit types for function signatures
- Use type inference for obvious local variables
- Use primitive types (u32, f64, etc.) over aliases
Control Flow
- Use early returns to reduce nesting
- Prefer ? over match for simple error propagation
- Use if let for optional values when pattern matching is simple
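A small sketch of these control-flow preferences follows. Both helpers (parse_timestamp and describe_language) are hypothetical and exist only to show early returns, ? propagation, and if let in one place.

```rust
use std::num::ParseIntError;

/// Parses "MM:SS" or plain seconds into a total number of seconds.
fn parse_timestamp(text: &str) -> Result<u64, ParseIntError> {
    let text = text.trim();
    // if let handles the simple "has a separator" case and returns early.
    if let Some((minutes, seconds)) = text.split_once(':') {
        // ? propagates parse failures without a match block.
        let minutes: u64 = minutes.parse()?;
        let seconds: u64 = seconds.parse()?;
        return Ok(minutes.saturating_mul(60).saturating_add(seconds));
    }
    // No separator: treat the whole string as seconds.
    text.parse()
}

/// Describes the language setting, falling back to auto-detection.
fn describe_language(language: Option<&str>) -> String {
    if let Some(code) = language {
        return format!("language: {code}");
    }
    "language: auto-detect".to_string()
}
```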
Project Structure
rust-scribe/
├── src/
│   └── main.rs          # Main application code
├── models/              # Whisper model files
├── Cargo.toml           # Project manifest
└── .cargo/
    └── config.toml      # Cargo configuration
Configuration
CLI Arguments
- input_file (positional): Path to video/audio file
- --model: Path to Whisper model file
- --language: Target language (optional, auto-detects if not specified)
- --verbose: Enable verbose output
Model Requirements
Place Whisper model files (e.g., ggml-base.bin) in the models/ directory.
Common Tasks
Adding a New Dependency
Add the crate to the [dependencies] section of Cargo.toml:
package_name = "version"
Adding a New Feature
- Implement the feature in a new function in src/main.rs
- Add a CLI argument to the Args struct if needed
- Test with sample audio/video files
Debugging
- Use eprintln! for debug output (goes to stderr)
- Use println! for progress messages
- Enable the --verbose flag for Whisper debug output