Core modules: - probe.rs: ffprobe execution logic - parser.rs: JSON parsing logic - output.rs: Output formatting - lib.rs: Library interface - main.rs: CLI entry point Features: - Extract video metadata using ffprobe - Parse video/audio/subtitle streams - Save to JSON file - Console summary output Documentation: - Added QUICKSTART.md - Added ENVIRONMENT_SETUP_REPORT.md
16 KiB
16 KiB
video_probe (Rust) - 开发计划
项目概述
将 Python 版本的 video_probe.py 重写为 Rust 版本,作为独立的 Gitea 仓库。
目标: 高性能、跨平台的视频元数据提取工具
输入: 视频文件路径
输出: <video_name>.probe.json 文件
功能需求
核心功能
- ✅ 使用 ffprobe 提取视频元数据
- ✅ 解析 JSON 输出
- ✅ 提取格式信息(format)
- ✅ 提取视频流信息(video stream)
- ✅ 提取音频流信息(audio streams)
- ✅ 提取字幕流信息(subtitle streams)
- ✅ 提取其他流信息(other streams)
- ✅ 保存为格式化的 JSON 文件
- ✅ 命令行参数解析
- ✅ 友好的控制台输出
高级功能(可选)
- 批量处理多个视频文件
- 递归扫描目录
- 自定义输出路径
- 输出格式选项(JSON/YAML/TOML)
- 并行处理
- 进度条显示
- 错误容忍模式(跳过失败文件)
技术栈
Rust 依赖
核心依赖
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"
命令行工具
clap = { version = "4.0", features = ["derive"] }
可选增强
indicatif = "0.17" # 进度条
rayon = "1.8" # 并行处理
walkdir = "2.4" # 目录遍历
外部依赖
- ffprobe: 系统需安装 FFmpeg(与 Python 版本相同)
项目结构
video_probe/
├── Cargo.toml
├── Cargo.lock
├── README.md
├── LICENSE
├── .gitignore
├── src/
│ ├── main.rs # 入口点
│ ├── lib.rs # 库接口
│ ├── probe.rs # ffprobe 执行逻辑
│ ├── parser.rs # JSON 解析逻辑
│ ├── metadata.rs # 元数据结构定义
│ ├── output.rs # 输出格式化
│ └── error.rs # 错误处理
├── tests/
│ ├── integration_test.rs
│ └── fixtures/
│ └── sample.mp4
└── docs/
├── USAGE.md
└── DEVELOPMENT.md
开发步骤
阶段 1: 项目初始化(Day 1)
1.1 创建 Cargo 项目
cargo new video_probe
cd video_probe
1.2 配置 Cargo.toml
[package]
name = "video_probe"
version = "0.1.0"
edition = "2021"
authors = ["Your Name <your.email@example.com>"]
description = "Extract video metadata using ffprobe"
license = "MIT"
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"
clap = { version = "4.0", features = ["derive"] }
[dev-dependencies]
tempfile = "3.8"
1.3 创建基础文件结构
mkdir -p src tests docs
touch src/{lib.rs,probe.rs,parser.rs,metadata.rs,output.rs,error.rs}
阶段 2: 核心数据结构(Day 1-2)
2.1 定义元数据结构(src/metadata.rs)
use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};
#[derive(Debug, Serialize, Deserialize)]
pub struct VideoMetadata {
pub video_path: String,
pub probed_at: DateTime<Utc>,
pub format: FormatInfo,
pub video_stream: Option<VideoStream>,
pub audio_streams: Vec<AudioStream>,
pub subtitle_streams: Vec<SubtitleStream>,
pub other_streams: Vec<OtherStream>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct FormatInfo {
pub filename: Option<String>,
pub format_name: Option<String>,
pub format_long_name: Option<String>,
pub duration: f64,
pub size: u64,
pub bit_rate: u64,
pub probe_score: Option<i32>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct VideoStream {
pub index: i32,
pub codec_name: Option<String>,
pub codec_long_name: Option<String>,
pub profile: Option<String>,
pub level: Option<i32>,
pub width: i32,
pub height: i32,
pub coded_width: Option<i32>,
pub coded_height: Option<i32>,
pub aspect_ratio: Option<String>,
pub pix_fmt: Option<String>,
pub field_order: Option<String>,
pub r_frame_rate: Option<String>,
pub avg_frame_rate: Option<String>,
pub time_base: Option<String>,
pub start_pts: Option<i64>,
pub start_time: f64,
pub duration: Option<f64>,
pub bit_rate: Option<u64>,
pub nb_frames: Option<u64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct AudioStream {
pub index: i32,
pub codec_name: Option<String>,
pub codec_long_name: Option<String>,
pub profile: Option<String>,
pub channels: i32,
pub channel_layout: Option<String>,
pub sample_rate: Option<String>,
pub sample_fmt: Option<String>,
pub bit_rate: Option<u64>,
pub duration: Option<f64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct SubtitleStream {
pub index: i32,
pub codec_name: Option<String>,
pub language: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct OtherStream {
pub index: i32,
pub codec_type: String,
pub codec_name: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub tags: Option<serde_json::Value>,
}
2.2 定义错误类型(src/error.rs)
use thiserror::Error;
#[derive(Debug, Error)]
pub enum ProbeError {
#[error("Video file not found: {0}")]
FileNotFound(String),
#[error("Failed to execute ffprobe: {0}")]
FfprobeExecution(#[from] std::io::Error),
#[error("Failed to parse ffprobe output: {0}")]
ParseError(#[from] serde_json::Error),
#[error("ffprobe returned non-zero exit code: {0}")]
FfprobeFailed(String),
#[error("No video stream found")]
NoVideoStream,
}
阶段 3: ffprobe 执行逻辑(Day 2-3)
3.1 实现 ffprobe 调用(src/probe.rs)
use std::process::Command;
use anyhow::Result;
use crate::error::ProbeError;
pub fn run_ffprobe(video_path: &str) -> Result<String> {
// 检查文件是否存在
if !std::path::Path::new(video_path).exists() {
return Err(ProbeError::FileNotFound(video_path.to_string()).into());
}
// 执行 ffprobe
let output = Command::new("ffprobe")
.args(&[
"-v", "quiet",
"-print_format", "json",
"-show_format",
"-show_streams",
video_path
])
.output()?;
// 检查退出码
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(ProbeError::FfprobeFailed(stderr.to_string()).into());
}
// 返回 JSON 输出
let stdout = String::from_utf8(output.stdout)?;
Ok(stdout)
}
3.2 实现并行版本(可选)
use rayon::prelude::*;
pub fn probe_videos_parallel(video_paths: &[&str]) -> Vec<Result<VideoMetadata>> {
video_paths.par_iter()
.map(|path| probe_video(path))
.collect()
}
阶段 4: JSON 解析逻辑(Day 3)
4.1 实现 JSON 解析(src/parser.rs)
use serde_json::Value;
use anyhow::Result;
use crate::metadata::*;
#[derive(Debug, Deserialize)]
struct FfprobeOutput {
format: Option<Value>,
streams: Option<Vec<Value>>,
}
pub fn parse_ffprobe_json(json_str: &str, video_path: &str) -> Result<VideoMetadata> {
let ffprobe_data: FfprobeOutput = serde_json::from_str(json_str)?;
let mut metadata = VideoMetadata {
video_path: std::fs::canonicalize(video_path)?
.to_string_lossy()
.to_string(),
probed_at: chrono::Utc::now(),
format: FormatInfo::default(),
video_stream: None,
audio_streams: Vec::new(),
subtitle_streams: Vec::new(),
other_streams: Vec::new(),
};
// 解析 format
if let Some(fmt) = ffprobe_data.format {
metadata.format = parse_format(&fmt)?;
}
// 解析 streams
if let Some(streams) = ffprobe_data.streams {
for stream in streams {
let codec_type = stream.get("codec_type")
.and_then(|v| v.as_str())
.unwrap_or("");
match codec_type {
"video" => {
if metadata.video_stream.is_none() {
metadata.video_stream = Some(parse_video_stream(&stream)?);
}
}
"audio" => {
metadata.audio_streams.push(parse_audio_stream(&stream)?);
}
"subtitle" => {
metadata.subtitle_streams.push(parse_subtitle_stream(&stream)?);
}
_ => {
metadata.other_streams.push(parse_other_stream(&stream)?);
}
}
}
}
Ok(metadata)
}
fn parse_format(fmt: &Value) -> Result<FormatInfo> {
Ok(FormatInfo {
filename: fmt.get("filename").and_then(|v| v.as_str()).map(String::from),
format_name: fmt.get("format_name").and_then(|v| v.as_str()).map(String::from),
format_long_name: fmt.get("format_long_name").and_then(|v| v.as_str()).map(String::from),
duration: fmt.get("duration").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0.0),
size: fmt.get("size").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
bit_rate: fmt.get("bit_rate").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
probe_score: fmt.get("probe_score").and_then(|v| v.as_i64()).map(|i| i as i32),
tags: fmt.get("tags").cloned(),
})
}
// 类似地实现其他 parse_* 函数...
阶段 5: 输出和格式化(Day 3-4)
5.1 实现输出逻辑(src/output.rs)
use std::path::Path;
use anyhow::Result;
use crate::metadata::VideoMetadata;
pub fn save_metadata(video_path: &str, metadata: &VideoMetadata) -> Result<String> {
let video_path = Path::new(video_path);
let video_dir = video_path.parent().unwrap_or(Path::new("."));
let video_name = video_path.file_stem()
.and_then(|s| s.to_str())
.unwrap_or("unknown");
let output_file = video_dir.join(format!("{}.probe.json", video_name));
let json = serde_json::to_string_pretty(metadata)?;
std::fs::write(&output_file, json)?;
Ok(output_file.to_string_lossy().to_string())
}
pub fn print_summary(metadata: &VideoMetadata) {
println!("✓ Video probed successfully!\n");
if let Some(ref filename) = metadata.format.filename {
println!("File: {}", filename);
}
if let Some(ref format_name) = metadata.format.format_long_name {
println!("Format: {}", format_name);
}
println!("Duration: {:.2} seconds", metadata.format.duration);
println!("Size: {:.2} MB", metadata.format.size as f64 / 1024.0 / 1024.0);
println!("Bit rate: {:.0} kbps", metadata.format.bit_rate as f64 / 1000.0);
if let Some(ref vs) = metadata.video_stream {
println!("\nVideo Stream:");
println!(" Codec: {} ({:?})",
vs.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
vs.profile);
println!(" Resolution: {}x{}", vs.width, vs.height);
println!(" Frame rate: {}", vs.r_frame_rate.as_ref().unwrap_or(&"N/A".to_string()));
println!(" Pixel format: {}", vs.pix_fmt.as_ref().unwrap_or(&"N/A".to_string()));
}
if !metadata.audio_streams.is_empty() {
println!("\nAudio Streams: {}", metadata.audio_streams.len());
for (i, audio) in metadata.audio_streams.iter().enumerate() {
println!(" [{}] {} - {} channels @ {} Hz",
i + 1,
audio.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
audio.channels,
audio.sample_rate.as_ref().unwrap_or(&"N/A".to_string()));
}
}
if !metadata.subtitle_streams.is_empty() {
println!("\nSubtitle Streams: {}", metadata.subtitle_streams.len());
for (i, sub) in metadata.subtitle_streams.iter().enumerate() {
println!(" [{}] {} ({:?})",
i + 1,
sub.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
sub.language);
}
}
}
阶段 6: 命令行界面(Day 4)
6.1 实现主程序(src/main.rs)
use clap::Parser;
use anyhow::Result;
mod probe;
mod parser;
mod metadata;
mod output;
mod error;
#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
/// Video file path
video_path: String,
/// Output directory (default: same as video file)
#[arg(short, long)]
output: Option<String>,
/// Verbose output
#[arg(short, long)]
verbose: bool,
}
fn main() -> Result<()> {
let args = Args::parse();
println!("Probing video: {}", args.video_path);
println!("{}", "=".repeat(60));
// 执行 ffprobe
let json_output = probe::run_ffprobe(&args.video_path)?;
// 解析 JSON
let metadata = parser::parse_ffprobe_json(&json_output, &args.video_path)?;
// 保存到文件
let output_file = output::save_metadata(&args.video_path, &metadata)?;
// 打印摘要
output::print_summary(&metadata);
println!("\n✓ Metadata saved to: {}", output_file);
println!("{}", "=".repeat(60));
Ok(())
}
阶段 7: 测试(Day 5)
7.1 单元测试
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_format() {
let json = r#"{
"filename": "test.mp4",
"format_name": "mov,mp4",
"duration": "120.5",
"size": "52428800",
"bit_rate": "3473408"
}"#;
let value: serde_json::Value = serde_json::from_str(json).unwrap();
let format = parse_format(&value).unwrap();
assert_eq!(format.filename, Some("test.mp4".to_string()));
assert_eq!(format.duration, 120.5);
}
}
7.2 集成测试
#[test]
fn test_probe_video() {
let video_path = "tests/fixtures/sample.mp4";
let result = probe_video(video_path);
assert!(result.is_ok());
}
阶段 8: 文档和发布(Day 5-6)
8.1 编写 README.md
# video_probe
Extract video metadata using ffprobe (Rust version)
## Installation
```bash
cargo install video_probe
Usage
video_probe video.mp4
Features
- Fast and efficient (written in Rust)
- Cross-platform (Linux, macOS, Windows)
- Comprehensive metadata extraction
- JSON output format
- User-friendly console output
#### 8.2 发布到 crates.io
```bash
cargo publish
开发时间表
| 阶段 | 任务 | 预计时间 |
|---|---|---|
| 1 | 项目初始化 | 0.5 天 |
| 2 | 数据结构定义 | 1 天 |
| 3 | ffprobe 执行逻辑 | 1 天 |
| 4 | JSON 解析逻辑 | 1 天 |
| 5 | 输出和格式化 | 0.5 天 |
| 6 | 命令行界面 | 0.5 天 |
| 7 | 测试 | 1 天 |
| 8 | 文档和发布 | 0.5 天 |
| 总计 | 6 天 |
与 Python 版本的对比
| 特性 | Python 版本 | Rust 版本 | 优势 |
|---|---|---|---|
| 性能 | 中等 | 高 | 2-10x 更快 |
| 内存使用 | 较高 | 低 | 更高效 |
| 启动时间 | 慢 | 快 | 即时启动 |
| 部署 | 需要 Python | 单二进制 | 更简单 |
| 跨平台 | 是 | 是 | 相同 |
| 依赖管理 | pip | Cargo | Cargo 更好 |
| 类型安全 | 弱 | 强 | 编译时检查 |
| 并发支持 | 有限 | 优秀 | Rayon 并行 |
| 错误处理 | 异常 | Result | 更明确 |
下一步行动
- ✅ 创建 Gitea 仓库
video_probe - ✅ 初始化 Cargo 项目
- ✅ 实现核心功能
- ✅ 添加测试
- ✅ 编写文档
- ✅ 发布到 crates.io(可选)