# AGENTS.md - rust-scribe

## Project Overview

rust-scribe is a high-performance video/audio transcriber with timestamps, built with Rust and Whisper. It extracts the audio track from video files and transcribes it using the Whisper.cpp library.

## Build Commands

```bash
# Build the project (debug mode)
cargo build

# Build release version
cargo build --release

# Run the application
cargo run --release -- <input_file> --model <model_path> [options]

# Example usage:
cargo run --release -- video.mp4 --model models/ggml-base.bin --language zh
```

### Running Tests

```bash
# Run all tests
cargo test

# Run a single test by name
cargo test <test_name>

# Run tests with output
cargo test -- --nocapture
```
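
To make the test commands concrete, here is a minimal sketch of a unit test module of the kind `cargo test` picks up. The helper `samples_to_ms` and both test names are hypothetical and do not come from the project; the only outside fact assumed is that Whisper works on 16 kHz audio.

```rust
/// Hypothetical helper: converts a sample count at `sample_rate` Hz into
/// whole milliseconds, rounding down. Panics if `sample_rate` is 0.
fn samples_to_ms(samples: u64, sample_rate: u32) -> u64 {
    samples.saturating_mul(1_000) / u64::from(sample_rate)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn one_second_of_16khz_audio_is_1000_ms() {
        assert_eq!(samples_to_ms(16_000, 16_000), 1_000);
    }

    #[test]
    fn partial_milliseconds_round_down() {
        // 8 samples at 16 kHz are 0.5 ms, which truncates to 0.
        assert_eq!(samples_to_ms(8, 16_000), 0);
    }
}
```

With a module like this, `cargo test one_second` would run only the first test, because the name argument is matched as a substring of the test path.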

### Linting and Formatting

```bash
# Run clippy for linting
cargo clippy

# Fix clippy suggestions automatically
cargo clippy --fix

# Format code
cargo fmt

# Check formatting
cargo fmt --check
```

## Dependencies

- **ffmpeg-next** (v8.0): FFmpeg bindings for audio extraction and resampling
- **whisper-rs** (v0.12): Rust bindings for Whisper.cpp
- **whisper-rs-sys** (v0.10): Low-level Whisper bindings
- **clap** (v4.5): CLI argument parsing
- **anyhow** (v1.0): Error handling
- **ndarray** (v0.15): Array operations

## Code Style Guidelines

### Formatting
- Use `cargo fmt` for consistent formatting
- 4-space indentation (Rust default)
- Maximum line length: 100 characters (default)

### Imports
- Group imports by crate: std → external → local
- Use absolute paths with `crate::` for internal modules
- Prefer bringing traits into scope when using them

```rust
use std::path::Path;
use anyhow::{Context, Result};
use clap::Parser;
use ffmpeg_next as ffmpeg;
```

### Naming Conventions
- **Variables/functions**: snake_case (e.g., `extract_audio_to_f32`, `audio_data`)
- **Types/Enums**: PascalCase (e.g., `Args`, `WhisperContext`)
- **Constants**: SCREAMING_SNAKE_CASE (e.g., `WHISPER_SAMPLE_RATE`)
- **Files**: snake_case (e.g., `main.rs`)

### Error Handling
- Use `anyhow::Result<T>` for application-level error handling
- Use the `?` operator to propagate errors
- Use the `Context` trait to add context to errors
- Use `anyhow::bail!` for early returns with errors
- Provide descriptive error messages in Chinese or English

```rust
use std::fs::File;

use anyhow::{Context, Result};

fn load_config() -> Result<Config> {
    // `.context(...)` wraps the I/O error with a human-readable message.
    let file = File::open("config.toml")
        .context("Failed to open config file")?;
    // ...
}
```

### Unsafe Code
- Minimize unsafe code; isolate it in small, well-documented functions
- Use `unsafe` blocks only when necessary (e.g., FFI callbacks)
- Document preconditions and invariants

```rust
unsafe extern "C" fn progress_callback(...) {
    // Document what this callback does
    // Keep unsafe block minimal
}
```
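
As a fuller illustration of the guidelines above, the following is a minimal sketch of an FFI progress callback that shares state through an atomic. The parameter list is an assumption: it uses opaque `*mut c_void` pointers as stand-ins for the context and state types, so check the actual callback type in the `whisper-rs-sys` bindings before copying it. `PROGRESS_PERCENT` and `current_progress` are hypothetical names.

```rust
use std::os::raw::{c_int, c_void};
use std::sync::atomic::{AtomicI32, Ordering};

/// Last progress percentage reported by the callback, clamped to 0..=100.
static PROGRESS_PERCENT: AtomicI32 = AtomicI32::new(0);

/// Progress callback with a C ABI (assumed parameter list, see above).
///
/// # Safety
///
/// The C caller may pass any pointer values. This function never
/// dereferences them, so null or dangling pointers are acceptable.
unsafe extern "C" fn progress_callback(
    _ctx: *mut c_void,
    _state: *mut c_void,
    progress: c_int,
    _user_data: *mut c_void,
) {
    // Only an atomic store happens here: no allocation, no panics.
    PROGRESS_PERCENT.store(progress.clamp(0, 100), Ordering::Relaxed);
}

/// Safe accessor so the rest of the program never touches the callback.
fn current_progress() -> i32 {
    PROGRESS_PERCENT.load(Ordering::Relaxed)
}
```

Keeping the callback body to a single atomic store means the `unsafe` surface is limited to the function signature, and the rest of the program can read progress through a safe function.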

### Documentation
- Add doc comments (`///`) for public functions
- Document parameters and return values
- Include usage examples for complex functions
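
A doc comment following these points might look like the sketch below. The function `format_timestamp` and its `HH:MM:SS.mmm` output format are hypothetical, chosen only because the project deals with timestamps; they are not taken from the codebase.

```rust
/// Formats a millisecond offset as an `HH:MM:SS.mmm` timestamp.
///
/// # Arguments
///
/// * `total_ms` - Offset from the start of the audio, in milliseconds.
///
/// # Returns
///
/// A zero-padded string such as `00:01:05.250`.
///
/// # Examples
///
///     let ts = format_timestamp(65_250);
///     assert_eq!(ts, "00:01:05.250");
pub fn format_timestamp(total_ms: u64) -> String {
    let ms = total_ms % 1_000;
    let total_secs = total_ms / 1_000;
    let secs = total_secs % 60;
    let mins = (total_secs / 60) % 60;
    let hours = total_secs / 3_600;
    format!("{hours:02}:{mins:02}:{secs:02}.{ms:03}")
}
```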

### Performance Considerations
- Use `AtomicU64`/`AtomicBool` for global state in callbacks
- Pre-allocate vectors with `Vec::with_capacity()` when size is known
- Use `saturating_*` operations where a result should clamp at the type's bounds instead of overflowing
- Reuse objects instead of creating new ones in loops
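
The pre-allocation, saturating arithmetic, and buffer-reuse points can be sketched as follows. All three functions are hypothetical helpers written for illustration, assuming 16-bit PCM input; none of them is taken from the project.

```rust
/// Converts signed 16-bit PCM samples to `f32` in the range [-1.0, 1.0).
fn pcm_i16_to_f32(input: &[i16]) -> Vec<f32> {
    // The output length is known up front, so allocate exactly once.
    let mut out = Vec::with_capacity(input.len());
    out.extend(input.iter().map(|&s| f32::from(s) / 32_768.0));
    out
}

/// Milliseconds left, clamped at zero instead of underflowing.
fn remaining_ms(total_ms: u64, elapsed_ms: u64) -> u64 {
    total_ms.saturating_sub(elapsed_ms)
}

/// Peak absolute amplitude per chunk, reusing one scratch buffer.
fn chunk_peaks(samples: &[i16], chunk_len: usize) -> Vec<f32> {
    let chunk_len = chunk_len.max(1); // `chunks(0)` would panic
    let mut peaks = Vec::with_capacity(samples.len().div_ceil(chunk_len));
    let mut scratch: Vec<f32> = Vec::with_capacity(chunk_len);
    for chunk in samples.chunks(chunk_len) {
        scratch.clear(); // drops the elements but keeps the allocation
        scratch.extend(chunk.iter().map(|&s| f32::from(s) / 32_768.0));
        peaks.push(scratch.iter().fold(0.0_f32, |max, v| max.max(v.abs())));
    }
    peaks
}
```

Calling `scratch.clear()` inside the loop keeps the buffer's capacity, so after the first iteration no further allocation is needed for the scratch space.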

### Type Annotations
- Prefer explicit types for function signatures
- Use type inference for obvious local variables
- Use primitive types (`u32`, `f64`, etc.) over aliases

### Control Flow
- Use early returns to reduce nesting
- Prefer `?` over `match` for simple error propagation
- Use `if let` for optional values when pattern matching is simple
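
The three control-flow preferences can be sketched together as below. `parse_threads`, `DEFAULT_THREADS`, and `describe_language` are hypothetical and exist only for illustration; the project's actual CLI may not have a thread-count option at all.

```rust
use std::num::ParseIntError;

/// Hypothetical default used when no override is given.
const DEFAULT_THREADS: usize = 4;

/// Parses an optional thread-count override; `None` selects the default.
fn parse_threads(raw: Option<&str>) -> Result<usize, ParseIntError> {
    // Early return for the common case keeps the main path unindented.
    let Some(text) = raw else {
        return Ok(DEFAULT_THREADS);
    };
    // `?` replaces a `match` that would only forward the error.
    let threads: usize = text.trim().parse()?;
    Ok(threads.max(1))
}

/// Describes the language setting, falling back to auto-detection.
fn describe_language(language: Option<&str>) -> String {
    // `if let` fits when only one pattern needs handling.
    if let Some(code) = language {
        return format!("language: {code}");
    }
    "language: auto-detect".to_string()
}
```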

## Project Structure

```
rust-scribe/
├── src/
│   └── main.rs          # Main application code
├── models/              # Whisper model files
├── Cargo.toml           # Project manifest
└── .cargo/
    └── config.toml      # Cargo configuration
```

## Configuration

### CLI Arguments
- `input_file` (positional): Path to video/audio file
- `--model`: Path to Whisper model file
- `--language`: Target language (optional, auto-detects if not specified)
- `--verbose`: Enable verbose output

### Model Requirements
Place Whisper model files (e.g., `ggml-base.bin`) in the `models/` directory.

## Common Tasks

### Adding a New Dependency
Add the crate to the `[dependencies]` section of `Cargo.toml`:
```toml
package_name = "version"
```

### Adding a New Feature
1. Implement the feature in a new function in `src/main.rs`
2. Add a CLI argument to the `Args` struct if needed
3. Test with sample audio/video files

### Debugging
- Use `eprintln!` for debug output (goes to stderr)
- Use `println!` for progress messages
- Pass the `--verbose` flag to enable Whisper debug output