# video_probe (Rust) - Development Plan

## Project Overview

Rewrite the Python script `video_probe.py` in Rust and host it as a standalone Gitea repository.

**Goal**: A high-performance, cross-platform video metadata extraction tool

**Input**: Path to a video file
**Output**: A `<video_name>.probe.json` file

---
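
## Functional Requirements

### Core features
1. ✅ Extract video metadata using ffprobe
2. ✅ Parse the JSON output
3. ✅ Extract format information (format)
4. ✅ Extract video stream information (video stream)
5. ✅ Extract audio stream information (audio streams)
6. ✅ Extract subtitle stream information (subtitle streams)
7. ✅ Extract information about other streams (other streams)
8. ✅ Save the result as a pretty-printed JSON file
9. ✅ Parse command-line arguments
10. ✅ Print a friendly console summary

### Advanced features (optional)
- [ ] Batch-process multiple video files
- [ ] Recursively scan directories
- [ ] Custom output path
- [ ] Output format options (JSON/YAML/TOML)
- [ ] Parallel processing
- [ ] Progress bar
- [ ] Error-tolerant mode (skip files that fail)

---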

## Technology Stack

### Rust dependencies

#### Core dependencies
```toml
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"
```

#### Command-line tooling
```toml
clap = { version = "4.0", features = ["derive"] }
```

#### Optional enhancements
```toml
indicatif = "0.17"  # progress bar
rayon = "1.8"       # parallel processing
walkdir = "2.4"     # directory traversal
```
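
`walkdir` is listed for directory traversal, but a first version of the optional recursive scan may not need it. The following is a minimal standard-library sketch of the same idea. The names `VIDEO_EXTENSIONS`, `is_video_file`, and `collect_videos`, as well as the extension list itself, are assumptions for illustration and are not part of the plan.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical extension list; adjust to the containers the tool should handle.
const VIDEO_EXTENSIONS: [&str; 5] = ["mp4", "mkv", "mov", "avi", "webm"];

/// Returns true when the path has one of the extensions above (case-insensitive).
pub fn is_video_file(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| VIDEO_EXTENSIONS.iter().any(|v| v.eq_ignore_ascii_case(ext)))
        .unwrap_or(false)
}

/// Recursively collects video files under `dir`, sorted for stable output.
/// Note: `is_dir` follows symlinks, so a symlink cycle would loop forever;
/// a real implementation would need to guard against that.
pub fn collect_videos(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![dir.to_path_buf()];
    while let Some(current) = pending.pop() {
        for entry in fs::read_dir(&current)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if is_video_file(&path) {
                found.push(path);
            }
        }
    }
    found.sort();
    Ok(found)
}
```

An explicit work list is used instead of recursion so that deeply nested directories do not grow the call stack. Switching to `walkdir` later should only require replacing the body of `collect_videos`.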

### External dependencies
- **ffprobe**: FFmpeg must be installed on the system (same as the Python version)

---

## Project Structure

```
video_probe/
├── Cargo.toml
├── Cargo.lock
├── README.md
├── LICENSE
├── .gitignore
├── src/
│   ├── main.rs           # entry point
│   ├── lib.rs            # library interface
│   ├── probe.rs          # ffprobe execution logic
│   ├── parser.rs         # JSON parsing logic
│   ├── metadata.rs       # metadata struct definitions
│   ├── output.rs         # output formatting
│   └── error.rs          # error handling
├── tests/
│   ├── integration_test.rs
│   └── fixtures/
│       └── sample.mp4
└── docs/
    ├── USAGE.md
    └── DEVELOPMENT.md
```

---

## Development Steps

### Phase 1: Project initialization (Day 1)

#### 1.1 Create the Cargo project
```bash
cargo new video_probe
cd video_probe
```

#### 1.2 Configure Cargo.toml
```toml
[package]
name = "video_probe"
version = "0.1.0"
edition = "2021"
authors = ["Your Name <your.email@example.com>"]
description = "Extract video metadata using ffprobe"
license = "MIT"

[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"
clap = { version = "4.0", features = ["derive"] }

[dev-dependencies]
tempfile = "3.8"
```

#### 1.3 Create the basic file layout
```bash
mkdir -p src tests docs
touch src/{lib.rs,probe.rs,parser.rs,metadata.rs,output.rs,error.rs}
```

---

### Phase 2: Core data structures (Day 1-2)

#### 2.1 Define the metadata structs (`src/metadata.rs`)

```rust
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct VideoMetadata {
    pub video_path: String,
    pub probed_at: DateTime<Utc>,
    pub format: FormatInfo,
    pub video_stream: Option<VideoStream>,
    pub audio_streams: Vec<AudioStream>,
    pub subtitle_streams: Vec<SubtitleStream>,
    pub other_streams: Vec<OtherStream>,
}

// `Default` is derived because the parser builds an empty FormatInfo
// when ffprobe returns no "format" object.
#[derive(Debug, Default, Serialize, Deserialize)]
pub struct FormatInfo {
    pub filename: Option<String>,
    pub format_name: Option<String>,
    pub format_long_name: Option<String>,
    pub duration: f64,
    pub size: u64,
    pub bit_rate: u64,
    pub probe_score: Option<i32>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct VideoStream {
    pub index: i32,
    pub codec_name: Option<String>,
    pub codec_long_name: Option<String>,
    pub profile: Option<String>,
    pub level: Option<i32>,
    pub width: i32,
    pub height: i32,
    pub coded_width: Option<i32>,
    pub coded_height: Option<i32>,
    pub aspect_ratio: Option<String>,
    pub pix_fmt: Option<String>,
    pub field_order: Option<String>,
    pub r_frame_rate: Option<String>,
    pub avg_frame_rate: Option<String>,
    pub time_base: Option<String>,
    pub start_pts: Option<i64>,
    pub start_time: f64,
    pub duration: Option<f64>,
    pub bit_rate: Option<u64>,
    pub nb_frames: Option<u64>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct AudioStream {
    pub index: i32,
    pub codec_name: Option<String>,
    pub codec_long_name: Option<String>,
    pub profile: Option<String>,
    pub channels: i32,
    pub channel_layout: Option<String>,
    pub sample_rate: Option<String>,
    pub sample_fmt: Option<String>,
    pub bit_rate: Option<u64>,
    pub duration: Option<f64>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct SubtitleStream {
    pub index: i32,
    pub codec_name: Option<String>,
    pub language: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct OtherStream {
    pub index: i32,
    pub codec_type: String,
    pub codec_name: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}
```
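
`r_frame_rate` and `avg_frame_rate` are stored as strings because ffprobe reports them as rationals such as `30000/1001`. If a numeric frame rate is needed later (for the console summary, say), a conversion along these lines could work. This is a sketch only; `parse_frame_rate` is a hypothetical helper that the plan does not define.

```rust
/// Parses an ffprobe rational such as "30000/1001" or "25/1" into frames per second.
/// Returns None for malformed input or a zero denominator (ffprobe can report
/// "0/0" when the rate is unknown).
pub fn parse_frame_rate(raw: &str) -> Option<f64> {
    let (num, den) = match raw.split_once('/') {
        Some((n, d)) => (n.trim().parse::<f64>().ok()?, d.trim().parse::<f64>().ok()?),
        // A bare number is treated as an already-reduced rate.
        None => (raw.trim().parse::<f64>().ok()?, 1.0),
    };
    if den == 0.0 || !num.is_finite() || !den.is_finite() {
        return None;
    }
    Some(num / den)
}
```

Keeping the original string in the struct and converting on demand avoids losing precision in the saved JSON.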

#### 2.2 Define the error type (`src/error.rs`)

```rust
use thiserror::Error;

#[derive(Debug, Error)]
pub enum ProbeError {
    #[error("Video file not found: {0}")]
    FileNotFound(String),

    #[error("Failed to execute ffprobe: {0}")]
    FfprobeExecution(#[from] std::io::Error),

    #[error("Failed to parse ffprobe output: {0}")]
    ParseError(#[from] serde_json::Error),

    #[error("ffprobe returned non-zero exit code: {0}")]
    FfprobeFailed(String),

    #[error("No video stream found")]
    NoVideoStream,
}
```

---

### Phase 3: ffprobe execution logic (Day 2-3)

#### 3.1 Implement the ffprobe call (`src/probe.rs`)

```rust
use std::process::Command;

use anyhow::Result;

use crate::error::ProbeError;

pub fn run_ffprobe(video_path: &str) -> Result<String> {
    // Check that the file exists
    if !std::path::Path::new(video_path).exists() {
        return Err(ProbeError::FileNotFound(video_path.to_string()).into());
    }

    // Run ffprobe. "-v error" keeps stderr populated on failure;
    // with "-v quiet" the FfprobeFailed message below would be empty.
    let output = Command::new("ffprobe")
        .args(&[
            "-v", "error",
            "-print_format", "json",
            "-show_format",
            "-show_streams",
            video_path,
        ])
        .output()?;

    // Check the exit code
    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        return Err(ProbeError::FfprobeFailed(stderr.to_string()).into());
    }

    // Return the JSON output
    let stdout = String::from_utf8(output.stdout)?;
    Ok(stdout)
}
```
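
`run_ffprobe` assumes the `ffprobe` binary is on `PATH`; when it is missing, the user only sees a generic I/O error. A small pre-flight check could give a clearer message. The sketch below is one possible approach; `command_available` is a hypothetical helper and not part of the planned modules.

```rust
use std::process::{Command, Stdio};

/// Returns true if `program` can be spawned and exits successfully when asked
/// for its version. `-version` is the flag the FFmpeg tools accept.
pub fn command_available(program: &str) -> bool {
    Command::new(program)
        .arg("-version")
        // Discard the version banner; only the exit status matters here.
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

Calling `command_available("ffprobe")` once at startup would let the CLI print an installation hint instead of failing on the first file.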

#### 3.2 Implement a parallel version (optional)

```rust
use anyhow::Result;
use rayon::prelude::*;

use crate::metadata::VideoMetadata;
// Assumes `probe_video` (run ffprobe, then parse) is exported from lib.rs.
use crate::probe_video;

pub fn probe_videos_parallel(video_paths: &[&str]) -> Vec<Result<VideoMetadata>> {
    video_paths
        .par_iter()
        .map(|path| probe_video(path))
        .collect()
}
```
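
`probe_videos_parallel` returns one `Result` per input, which pairs naturally with the optional error-tolerant mode: failed files are reported while the rest of the batch continues. A minimal sketch of that bookkeeping follows. `split_results` is a hypothetical helper, and pairing each result with its path is an assumption about how the caller would track inputs. It is generic over the success and error types so it does not depend on `VideoMetadata` or `anyhow`.

```rust
/// Splits per-file outcomes into successes and failures, keeping the path
/// alongside each so failures can be reported without aborting the batch.
pub fn split_results<T, E>(
    results: Vec<(String, Result<T, E>)>,
) -> (Vec<(String, T)>, Vec<(String, E)>) {
    let mut succeeded = Vec::new();
    let mut failed = Vec::new();
    for (path, result) in results {
        match result {
            Ok(value) => succeeded.push((path, value)),
            Err(error) => failed.push((path, error)),
        }
    }
    (succeeded, failed)
}
```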

---

### Phase 4: JSON parsing logic (Day 3)

#### 4.1 Implement JSON parsing (`src/parser.rs`)

```rust
use anyhow::Result;
use serde::Deserialize;
use serde_json::Value;

use crate::metadata::*;

#[derive(Debug, Deserialize)]
struct FfprobeOutput {
    format: Option<Value>,
    streams: Option<Vec<Value>>,
}

pub fn parse_ffprobe_json(json_str: &str, video_path: &str) -> Result<VideoMetadata> {
    let ffprobe_data: FfprobeOutput = serde_json::from_str(json_str)?;

    // Parse format; a missing "format" object yields empty or zeroed fields
    let format = parse_format(ffprobe_data.format.as_ref().unwrap_or(&Value::Null))?;

    let mut metadata = VideoMetadata {
        video_path: std::fs::canonicalize(video_path)?
            .to_string_lossy()
            .to_string(),
        probed_at: chrono::Utc::now(),
        format,
        video_stream: None,
        audio_streams: Vec::new(),
        subtitle_streams: Vec::new(),
        other_streams: Vec::new(),
    };

    // Parse streams
    if let Some(streams) = ffprobe_data.streams {
        for stream in streams {
            let codec_type = stream
                .get("codec_type")
                .and_then(|v| v.as_str())
                .unwrap_or("");

            match codec_type {
                "video" => {
                    // Only the first video stream is kept
                    if metadata.video_stream.is_none() {
                        metadata.video_stream = Some(parse_video_stream(&stream)?);
                    }
                }
                "audio" => {
                    metadata.audio_streams.push(parse_audio_stream(&stream)?);
                }
                "subtitle" => {
                    metadata.subtitle_streams.push(parse_subtitle_stream(&stream)?);
                }
                _ => {
                    metadata.other_streams.push(parse_other_stream(&stream)?);
                }
            }
        }
    }

    Ok(metadata)
}

fn parse_format(fmt: &Value) -> Result<FormatInfo> {
    // ffprobe emits duration, size and bit_rate as JSON strings, hence as_str + parse
    Ok(FormatInfo {
        filename: fmt.get("filename").and_then(|v| v.as_str()).map(String::from),
        format_name: fmt.get("format_name").and_then(|v| v.as_str()).map(String::from),
        format_long_name: fmt.get("format_long_name").and_then(|v| v.as_str()).map(String::from),
        duration: fmt.get("duration").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0.0),
        size: fmt.get("size").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
        bit_rate: fmt.get("bit_rate").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
        probe_score: fmt.get("probe_score").and_then(|v| v.as_i64()).map(|i| i as i32),
        tags: fmt.get("tags").cloned(),
    })
}

// Implement the other parse_* functions in the same way...
```
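
`parse_format` repeats the same "read a string, parse it, fall back to a default" chain for `duration`, `size`, and `bit_rate`, and the remaining `parse_*` functions will need it too. One way to factor that out is sketched below using only the standard library. `parse_or_default` and `parse_optional` are hypothetical helper names; the caller would pass in the result of `.as_str()` on the JSON value.

```rust
use std::str::FromStr;

/// Parses an optional string field, falling back to `T::default()` when the
/// field is absent or is not a valid number (ffprobe sometimes prints "N/A").
pub fn parse_or_default<T: FromStr + Default>(raw: Option<&str>) -> T {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or_default()
}

/// Same idea, but keeps "missing or invalid" distinct from zero, which suits
/// the `Option<u64>` / `Option<f64>` fields on the stream structs.
pub fn parse_optional<T: FromStr>(raw: Option<&str>) -> Option<T> {
    raw.and_then(|s| s.trim().parse().ok())
}
```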

---

### Phase 5: Output and formatting (Day 3-4)

#### 5.1 Implement the output logic (`src/output.rs`)

```rust
use std::path::Path;

use anyhow::Result;

use crate::metadata::VideoMetadata;

pub fn save_metadata(video_path: &str, metadata: &VideoMetadata) -> Result<String> {
    let video_path = Path::new(video_path);
    let video_dir = video_path.parent().unwrap_or(Path::new("."));
    let video_name = video_path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("unknown");

    // Written next to the video as <video_name>.probe.json
    let output_file = video_dir.join(format!("{}.probe.json", video_name));

    let json = serde_json::to_string_pretty(metadata)?;
    std::fs::write(&output_file, json)?;

    Ok(output_file.to_string_lossy().to_string())
}

pub fn print_summary(metadata: &VideoMetadata) {
    println!("✓ Video probed successfully!\n");

    if let Some(filename) = &metadata.format.filename {
        println!("File: {}", filename);
    }

    if let Some(format_name) = &metadata.format.format_long_name {
        println!("Format: {}", format_name);
    }

    println!("Duration: {:.2} seconds", metadata.format.duration);
    println!("Size: {:.2} MB", metadata.format.size as f64 / 1024.0 / 1024.0);
    println!("Bit rate: {:.0} kbps", metadata.format.bit_rate as f64 / 1000.0);

    if let Some(vs) = &metadata.video_stream {
        println!("\nVideo Stream:");
        println!(
            "  Codec: {} ({})",
            vs.codec_name.as_deref().unwrap_or("N/A"),
            vs.profile.as_deref().unwrap_or("N/A")
        );
        println!("  Resolution: {}x{}", vs.width, vs.height);
        println!("  Frame rate: {}", vs.r_frame_rate.as_deref().unwrap_or("N/A"));
        println!("  Pixel format: {}", vs.pix_fmt.as_deref().unwrap_or("N/A"));
    }

    if !metadata.audio_streams.is_empty() {
        println!("\nAudio Streams: {}", metadata.audio_streams.len());
        for (i, audio) in metadata.audio_streams.iter().enumerate() {
            println!(
                "  [{}] {} - {} channels @ {} Hz",
                i + 1,
                audio.codec_name.as_deref().unwrap_or("N/A"),
                audio.channels,
                audio.sample_rate.as_deref().unwrap_or("N/A")
            );
        }
    }

    if !metadata.subtitle_streams.is_empty() {
        println!("\nSubtitle Streams: {}", metadata.subtitle_streams.len());
        for (i, sub) in metadata.subtitle_streams.iter().enumerate() {
            println!(
                "  [{}] {} ({})",
                i + 1,
                sub.codec_name.as_deref().unwrap_or("N/A"),
                sub.language.as_deref().unwrap_or("N/A")
            );
        }
    }
}
```
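
`print_summary` shows the duration in raw seconds and the size in megabytes. If a friendlier console summary is wanted, formatting helpers along these lines could be added to `output.rs`. This is a sketch under assumptions: `format_duration` and `format_size` are hypothetical names, and the choice of binary units (KiB, MiB, GiB) is illustrative rather than something the plan specifies.

```rust
/// Formats a duration in seconds as HH:MM:SS, rounding down to whole seconds.
/// Negative or non-finite input is shown as 00:00:00.
pub fn format_duration(seconds: f64) -> String {
    let total = if seconds.is_finite() && seconds > 0.0 {
        seconds as u64
    } else {
        0
    };
    format!("{:02}:{:02}:{:02}", total / 3600, (total % 3600) / 60, total % 60)
}

/// Formats a byte count using binary units, capped at GiB.
pub fn format_size(bytes: u64) -> String {
    const UNITS: [&str; 4] = ["B", "KiB", "MiB", "GiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        // Plain bytes are printed without decimals.
        format!("{} B", bytes)
    } else {
        format!("{:.2} {}", value, UNITS[unit])
    }
}
```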

---

### Phase 6: Command-line interface (Day 4)

#### 6.1 Implement the main program (`src/main.rs`)

```rust
use anyhow::Result;
use clap::Parser;

mod error;
mod metadata;
mod output;
mod parser;
mod probe;

#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
    /// Video file path
    video_path: String,

    /// Output directory (default: same as video file)
    #[arg(short, long)]
    output: Option<String>,

    /// Verbose output
    #[arg(short, long)]
    verbose: bool,
}

fn main() -> Result<()> {
    let args = Args::parse();

    println!("Probing video: {}", args.video_path);
    println!("{}", "=".repeat(60));

    // Run ffprobe
    let json_output = probe::run_ffprobe(&args.video_path)?;

    // Parse the JSON
    let metadata = parser::parse_ffprobe_json(&json_output, &args.video_path)?;

    // Save to file
    let output_file = output::save_metadata(&args.video_path, &metadata)?;

    // Print the summary
    output::print_summary(&metadata);

    println!("\n✓ Metadata saved to: {}", output_file);
    println!("{}", "=".repeat(60));

    Ok(())
}
```
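
`Args` declares an `output` directory option, but `main` never passes it on and `save_metadata` always writes next to the video. One way to wire the option in is to compute the target path in a separate function, as sketched below. `resolve_output_path` is a hypothetical helper; `save_metadata` would need an extra parameter to use it, which is an assumed change and not part of the plan.

```rust
use std::path::{Path, PathBuf};

/// Builds the path of the `.probe.json` file. When `output_dir` is None the
/// file is placed next to the video, matching the current `save_metadata`.
pub fn resolve_output_path(video_path: &Path, output_dir: Option<&Path>) -> PathBuf {
    let stem = video_path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("unknown");
    let dir = match output_dir {
        Some(dir) => dir,
        // A bare file name has an empty parent, which joins to a relative path.
        None => video_path.parent().unwrap_or(Path::new(".")),
    };
    dir.join(format!("{}.probe.json", stem))
}
```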

---

### Phase 7: Testing (Day 5)

#### 7.1 Unit tests

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_format() {
        let json = r#"{
            "filename": "test.mp4",
            "format_name": "mov,mp4",
            "duration": "120.5",
            "size": "52428800",
            "bit_rate": "3473408"
        }"#;

        let value: serde_json::Value = serde_json::from_str(json).unwrap();
        let format = parse_format(&value).unwrap();

        assert_eq!(format.filename, Some("test.mp4".to_string()));
        assert_eq!(format.duration, 120.5);
    }
}
```

#### 7.2 Integration tests

```rust
// Assumes lib.rs exports `probe_video` and that ffprobe is installed.
use video_probe::probe_video;

#[test]
fn test_probe_video() {
    let video_path = "tests/fixtures/sample.mp4";
    let result = probe_video(video_path);
    assert!(result.is_ok());
}
```

---

### Phase 8: Documentation and release (Day 5-6)

#### 8.1 Write README.md

````markdown
# video_probe

Extract video metadata using ffprobe (Rust version)

## Installation

```bash
cargo install video_probe
```

## Usage

```bash
video_probe video.mp4
```

## Features

- Fast and efficient (written in Rust)
- Cross-platform (Linux, macOS, Windows)
- Comprehensive metadata extraction
- JSON output format
- User-friendly console output
````

#### 8.2 Publish to crates.io

```bash
cargo publish
```

---

## Development Timeline

| Phase | Task | Estimated time |
|------|------|----------|
| 1 | Project initialization | 0.5 days |
| 2 | Data structure definitions | 1 day |
| 3 | ffprobe execution logic | 1 day |
| 4 | JSON parsing logic | 1 day |
| 5 | Output and formatting | 0.5 days |
| 6 | Command-line interface | 0.5 days |
| 7 | Testing | 1 day |
| 8 | Documentation and release | 0.5 days |
| **Total** | | **6 days** |

---

## Comparison with the Python Version

| Feature | Python version | Rust version | Advantage |
|------|-------------|-----------|------|
| Performance | Moderate | High | Projected 2-10x faster |
| Memory usage | Higher | Low | More efficient |
| Startup time | Slow | Fast | Near-instant startup |
| Deployment | Requires Python | Single binary | Simpler |
| Cross-platform | Yes | Yes | Same |
| Dependency management | pip | Cargo | Cargo is better |
| Type safety | Weak | Strong | Compile-time checks |
| Concurrency | Limited | Excellent | Parallelism via Rayon |
| Error handling | Exceptions | Result | More explicit |

---

## Next Steps

1. ✅ Create the Gitea repository `video_probe`
2. ✅ Initialize the Cargo project
3. ✅ Implement the core features
4. ✅ Add tests
5. ✅ Write documentation
6. ✅ Publish to crates.io (optional)

---

## References

- [Rust documentation](https://doc.rust-lang.org/)
- [serde documentation](https://serde.rs/)
- [clap documentation](https://docs.rs/clap/)
- [FFprobe documentation](https://ffmpeg.org/ffprobe.html)