1
0

Initial implementation of video_probe (Rust)

Core modules:
- probe.rs: ffprobe execution logic
- parser.rs: JSON parsing logic
- output.rs: Output formatting
- lib.rs: Library interface
- main.rs: CLI entry point

Features:
- Extract video metadata using ffprobe
- Parse video/audio/subtitle streams
- Save to JSON file
- Console summary output

Documentation:
- Added QUICKSTART.md
- Added ENVIRONMENT_SETUP_REPORT.md
This commit is contained in:
accusys
2026-03-07 10:10:19 +08:00
commit f3e2d2dca7
464 changed files with 125611 additions and 0 deletions

View File

@@ -0,0 +1,648 @@
# video_probe (Rust) - 开发计划
## 项目概述
将 Python 版本的 `video_probe.py` 重写为 Rust 版本,作为独立的 Gitea 仓库。
**目标**: 高性能、跨平台的视频元数据提取工具
**输入**: 视频文件路径
**输出**: `<video_name>.probe.json` 文件
---
## 功能需求
### 核心功能
1. ✅ 使用 ffprobe 提取视频元数据
2. ✅ 解析 JSON 输出
3. ✅ 提取格式信息format
4. ✅ 提取视频流信息video stream
5. ✅ 提取音频流信息audio streams
6. ✅ 提取字幕流信息subtitle streams
7. ✅ 提取其他流信息other streams
8. ✅ 保存为格式化的 JSON 文件
9. ✅ 命令行参数解析
10. ✅ 友好的控制台输出
### 高级功能(可选)
- [ ] 批量处理多个视频文件
- [ ] 递归扫描目录
- [ ] 自定义输出路径
- [ ] 输出格式选项JSON/YAML/TOML
- [ ] 并行处理
- [ ] 进度条显示
- [ ] 错误容忍模式(跳过失败文件)
---
## 技术栈
### Rust 依赖
#### 核心依赖
```toml
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"
```
#### 命令行工具
```toml
clap = { version = "4.0", features = ["derive"] }
```
#### 可选增强
```toml
indicatif = "0.17" # 进度条
rayon = "1.8" # 并行处理
walkdir = "2.4" # 目录遍历
```
### 外部依赖
- **ffprobe**: 系统需安装 FFmpeg与 Python 版本相同)
---
## 项目结构
```
video_probe/
├── Cargo.toml
├── Cargo.lock
├── README.md
├── LICENSE
├── .gitignore
├── src/
│ ├── main.rs # 入口点
│ ├── lib.rs # 库接口
│ ├── probe.rs # ffprobe 执行逻辑
│ ├── parser.rs # JSON 解析逻辑
│ ├── metadata.rs # 元数据结构定义
│ ├── output.rs # 输出格式化
│ └── error.rs # 错误处理
├── tests/
│ ├── integration_test.rs
│ └── fixtures/
│ └── sample.mp4
└── docs/
├── USAGE.md
└── DEVELOPMENT.md
```
---
## 开发步骤
### 阶段 1: 项目初始化Day 1
#### 1.1 创建 Cargo 项目
```bash
cargo new video_probe
cd video_probe
```
#### 1.2 配置 Cargo.toml
```toml
[package]
name = "video_probe"
version = "0.1.0"
edition = "2021"
authors = ["Your Name <your.email@example.com>"]
description = "Extract video metadata using ffprobe"
license = "MIT"
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"
clap = { version = "4.0", features = ["derive"] }
[dev-dependencies]
tempfile = "3.8"
```
#### 1.3 创建基础文件结构
```bash
mkdir -p src tests docs
touch src/{lib.rs,probe.rs,parser.rs,metadata.rs,output.rs,error.rs}
```
---
### 阶段 2: 核心数据结构Day 1-2
#### 2.1 定义元数据结构(`src/metadata.rs`
```rust
use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};
#[derive(Debug, Serialize, Deserialize)]
pub struct VideoMetadata {
pub video_path: String,
pub probed_at: DateTime<Utc>,
pub format: FormatInfo,
pub video_stream: Option<VideoStream>,
pub audio_streams: Vec<AudioStream>,
pub subtitle_streams: Vec<SubtitleStream>,
pub other_streams: Vec<OtherStream>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct FormatInfo {
pub filename: Option<String>,
pub format_name: Option<String>,
pub format_long_name: Option<String>,
pub duration: f64,
pub size: u64,
pub bit_rate: u64,
pub probe_score: Option<i32>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct VideoStream {
pub index: i32,
pub codec_name: Option<String>,
pub codec_long_name: Option<String>,
pub profile: Option<String>,
pub level: Option<i32>,
pub width: i32,
pub height: i32,
pub coded_width: Option<i32>,
pub coded_height: Option<i32>,
pub aspect_ratio: Option<String>,
pub pix_fmt: Option<String>,
pub field_order: Option<String>,
pub r_frame_rate: Option<String>,
pub avg_frame_rate: Option<String>,
pub time_base: Option<String>,
pub start_pts: Option<i64>,
pub start_time: f64,
pub duration: Option<f64>,
pub bit_rate: Option<u64>,
pub nb_frames: Option<u64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct AudioStream {
pub index: i32,
pub codec_name: Option<String>,
pub codec_long_name: Option<String>,
pub profile: Option<String>,
pub channels: i32,
pub channel_layout: Option<String>,
pub sample_rate: Option<String>,
pub sample_fmt: Option<String>,
pub bit_rate: Option<u64>,
pub duration: Option<f64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct SubtitleStream {
pub index: i32,
pub codec_name: Option<String>,
pub language: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct OtherStream {
pub index: i32,
pub codec_type: String,
pub codec_name: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
```
#### 2.2 定义错误类型(`src/error.rs`
```rust
use thiserror::Error;
#[derive(Debug, Error)]
pub enum ProbeError {
#[error("Video file not found: {0}")]
FileNotFound(String),
#[error("Failed to execute ffprobe: {0}")]
FfprobeExecution(#[from] std::io::Error),
#[error("Failed to parse ffprobe output: {0}")]
ParseError(#[from] serde_json::Error),
#[error("ffprobe returned non-zero exit code: {0}")]
FfprobeFailed(String),
#[error("No video stream found")]
NoVideoStream,
}
```
---
### 阶段 3: ffprobe 执行逻辑Day 2-3
#### 3.1 实现 ffprobe 调用(`src/probe.rs`
```rust
use std::process::Command;
use anyhow::Result;
use crate::error::ProbeError;
pub fn run_ffprobe(video_path: &str) -> Result<String> {
// 检查文件是否存在
if !std::path::Path::new(video_path).exists() {
return Err(ProbeError::FileNotFound(video_path.to_string()).into());
}
// 执行 ffprobe
let output = Command::new("ffprobe")
.args(&[
"-v", "quiet",
"-print_format", "json",
"-show_format",
"-show_streams",
video_path
])
.output()?;
// 检查退出码
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(ProbeError::FfprobeFailed(stderr.to_string()).into());
}
// 返回 JSON 输出
let stdout = String::from_utf8(output.stdout)?;
Ok(stdout)
}
```
#### 3.2 实现并行版本(可选)
```rust
use rayon::prelude::*;
pub fn probe_videos_parallel(video_paths: &[&str]) -> Vec<Result<VideoMetadata>> {
video_paths.par_iter()
.map(|path| probe_video(path))
.collect()
}
```
---
### 阶段 4: JSON 解析逻辑Day 3
#### 4.1 实现 JSON 解析(`src/parser.rs`
```rust
use serde_json::Value;
use anyhow::Result;
use crate::metadata::*;
#[derive(Debug, Deserialize)]
struct FfprobeOutput {
format: Option<Value>,
streams: Option<Vec<Value>>,
}
pub fn parse_ffprobe_json(json_str: &str, video_path: &str) -> Result<VideoMetadata> {
let ffprobe_data: FfprobeOutput = serde_json::from_str(json_str)?;
let mut metadata = VideoMetadata {
video_path: std::fs::canonicalize(video_path)?
.to_string_lossy()
.to_string(),
probed_at: chrono::Utc::now(),
format: FormatInfo::default(),
video_stream: None,
audio_streams: Vec::new(),
subtitle_streams: Vec::new(),
other_streams: Vec::new(),
};
// 解析 format
if let Some(fmt) = ffprobe_data.format {
metadata.format = parse_format(&fmt)?;
}
// 解析 streams
if let Some(streams) = ffprobe_data.streams {
for stream in streams {
let codec_type = stream.get("codec_type")
.and_then(|v| v.as_str())
.unwrap_or("");
match codec_type {
"video" => {
if metadata.video_stream.is_none() {
metadata.video_stream = Some(parse_video_stream(&stream)?);
}
}
"audio" => {
metadata.audio_streams.push(parse_audio_stream(&stream)?);
}
"subtitle" => {
metadata.subtitle_streams.push(parse_subtitle_stream(&stream)?);
}
_ => {
metadata.other_streams.push(parse_other_stream(&stream)?);
}
}
}
}
Ok(metadata)
}
fn parse_format(fmt: &Value) -> Result<FormatInfo> {
Ok(FormatInfo {
filename: fmt.get("filename").and_then(|v| v.as_str()).map(String::from),
format_name: fmt.get("format_name").and_then(|v| v.as_str()).map(String::from),
format_long_name: fmt.get("format_long_name").and_then(|v| v.as_str()).map(String::from),
duration: fmt.get("duration").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0.0),
size: fmt.get("size").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
bit_rate: fmt.get("bit_rate").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
probe_score: fmt.get("probe_score").and_then(|v| v.as_i64()).map(|i| i as i32),
tags: fmt.get("tags").cloned(),
})
}
// 类似地实现其他 parse_* 函数...
```
---
### 阶段 5: 输出和格式化Day 3-4
#### 5.1 实现输出逻辑(`src/output.rs`
```rust
use std::path::Path;
use anyhow::Result;
use crate::metadata::VideoMetadata;
pub fn save_metadata(video_path: &str, metadata: &VideoMetadata) -> Result<String> {
let video_path = Path::new(video_path);
let video_dir = video_path.parent().unwrap_or(Path::new("."));
let video_name = video_path.file_stem()
.and_then(|s| s.to_str())
.unwrap_or("unknown");
let output_file = video_dir.join(format!("{}.probe.json", video_name));
let json = serde_json::to_string_pretty(metadata)?;
std::fs::write(&output_file, json)?;
Ok(output_file.to_string_lossy().to_string())
}
pub fn print_summary(metadata: &VideoMetadata) {
println!("✓ Video probed successfully!\n");
if let Some(ref filename) = metadata.format.filename {
println!("File: {}", filename);
}
if let Some(ref format_name) = metadata.format.format_long_name {
println!("Format: {}", format_name);
}
println!("Duration: {:.2} seconds", metadata.format.duration);
println!("Size: {:.2} MB", metadata.format.size as f64 / 1024.0 / 1024.0);
println!("Bit rate: {:.0} kbps", metadata.format.bit_rate as f64 / 1000.0);
if let Some(ref vs) = metadata.video_stream {
println!("\nVideo Stream:");
println!(" Codec: {} ({:?})",
vs.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
vs.profile);
println!(" Resolution: {}x{}", vs.width, vs.height);
println!(" Frame rate: {}", vs.r_frame_rate.as_ref().unwrap_or(&"N/A".to_string()));
println!(" Pixel format: {}", vs.pix_fmt.as_ref().unwrap_or(&"N/A".to_string()));
}
if !metadata.audio_streams.is_empty() {
println!("\nAudio Streams: {}", metadata.audio_streams.len());
for (i, audio) in metadata.audio_streams.iter().enumerate() {
println!(" [{}] {} - {} channels @ {} Hz",
i + 1,
audio.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
audio.channels,
audio.sample_rate.as_ref().unwrap_or(&"N/A".to_string()));
}
}
if !metadata.subtitle_streams.is_empty() {
println!("\nSubtitle Streams: {}", metadata.subtitle_streams.len());
for (i, sub) in metadata.subtitle_streams.iter().enumerate() {
println!(" [{}] {} ({:?})",
i + 1,
sub.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
sub.language);
}
}
}
```
---
### 阶段 6: 命令行界面Day 4
#### 6.1 实现主程序(`src/main.rs`
```rust
use clap::Parser;
use anyhow::Result;
mod probe;
mod parser;
mod metadata;
mod output;
mod error;
#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
/// Video file path
video_path: String,
/// Output directory (default: same as video file)
#[arg(short, long)]
output: Option<String>,
/// Verbose output
#[arg(short, long)]
verbose: bool,
}
fn main() -> Result<()> {
let args = Args::parse();
println!("Probing video: {}", args.video_path);
println!("{}", "=".repeat(60));
// 执行 ffprobe
let json_output = probe::run_ffprobe(&args.video_path)?;
// 解析 JSON
let metadata = parser::parse_ffprobe_json(&json_output, &args.video_path)?;
// 保存到文件
let output_file = output::save_metadata(&args.video_path, &metadata)?;
// 打印摘要
output::print_summary(&metadata);
println!("\n✓ Metadata saved to: {}", output_file);
println!("{}", "=".repeat(60));
Ok(())
}
```
---
### 阶段 7: 测试Day 5
#### 7.1 单元测试
```rust
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_format() {
let json = r#"{
"filename": "test.mp4",
"format_name": "mov,mp4",
"duration": "120.5",
"size": "52428800",
"bit_rate": "3473408"
}"#;
let value: serde_json::Value = serde_json::from_str(json).unwrap();
let format = parse_format(&value).unwrap();
assert_eq!(format.filename, Some("test.mp4".to_string()));
assert_eq!(format.duration, 120.5);
}
}
```
#### 7.2 集成测试
```rust
#[test]
fn test_probe_video() {
let video_path = "tests/fixtures/sample.mp4";
let result = probe_video(video_path);
assert!(result.is_ok());
}
```
---
### 阶段 8: 文档和发布Day 5-6
#### 8.1 编写 README.md
```markdown
# video_probe
Extract video metadata using ffprobe (Rust version)
## Installation
```bash
cargo install video_probe
```
## Usage
```bash
video_probe video.mp4
```
## Features
- Fast and efficient (written in Rust)
- Cross-platform (Linux, macOS, Windows)
- Comprehensive metadata extraction
- JSON output format
- User-friendly console output
```
#### 8.2 发布到 crates.io
```bash
cargo publish
```
---
## 开发时间表
| 阶段 | 任务 | 预计时间 |
|------|------|----------|
| 1 | 项目初始化 | 0.5 天 |
| 2 | 数据结构定义 | 1 天 |
| 3 | ffprobe 执行逻辑 | 1 天 |
| 4 | JSON 解析逻辑 | 1 天 |
| 5 | 输出和格式化 | 0.5 天 |
| 6 | 命令行界面 | 0.5 天 |
| 7 | 测试 | 1 天 |
| 8 | 文档和发布 | 0.5 天 |
| **总计** | | **6 天** |
---
## 与 Python 版本的对比
| 特性 | Python 版本 | Rust 版本 | 优势 |
|------|-------------|-----------|------|
| 性能 | 中等 | 高 | 2-10x 更快 |
| 内存使用 | 较高 | 低 | 更高效 |
| 启动时间 | 慢 | 快 | 即时启动 |
| 部署 | 需要 Python | 单二进制 | 更简单 |
| 跨平台 | 是 | 是 | 相同 |
| 依赖管理 | pip | Cargo | Cargo 更好 |
| 类型安全 | 弱 | 强 | 编译时检查 |
| 并发支持 | 有限 | 优秀 | Rayon 并行 |
| 错误处理 | 异常 | Result | 更明确 |
---
## 下一步行动
1. ✅ 创建 Gitea 仓库 `video_probe`
2. ✅ 初始化 Cargo 项目
3. ✅ 实现核心功能
4. ✅ 添加测试
5. ✅ 编写文档
6. ✅ 发布到 crates.io可选
---
## 参考资料
- [Rust 文档](https://doc.rust-lang.org/)
- [serde 文档](https://serde.rs/)
- [clap 文档](https://docs.rs/clap/)
- [FFprobe 文档](https://ffmpeg.org/ffprobe.html)