video_probe/VIDEO_PROBE_RUST_DEVELOPMENT.md
accusys f3e2d2dca7 Initial implementation of video_probe (Rust)
Core modules:
- probe.rs: ffprobe execution logic
- parser.rs: JSON parsing logic
- output.rs: Output formatting
- lib.rs: Library interface
- main.rs: CLI entry point

Features:
- Extract video metadata using ffprobe
- Parse video/audio/subtitle streams
- Save to JSON file
- Console summary output

Documentation:
- Added QUICKSTART.md
- Added ENVIRONMENT_SETUP_REPORT.md
2026-03-07 10:10:19 +08:00


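video_probe (Rust) - Development Plan

Project Overview

Rewrite the Python tool video_probe.py in Rust and host it as a standalone Gitea repository.

Goal: a high-performance, cross-platform video metadata extraction tool

Input: path to a video file
Output: a <video_name>.probe.json file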


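Functional Requirements

Core Features

  1. Extract video metadata using ffprobe
  2. Parse the JSON output
  3. Extract container format information (format)
  4. Extract video stream information (video stream)
  5. Extract audio stream information (audio streams)
  6. Extract subtitle stream information (subtitle streams)
  7. Extract information about other streams (other streams)
  8. Save the result as a pretty-printed JSON file
  9. Parse command-line arguments
  10. Print user-friendly console output

Advanced Features (Optional)

  • Batch processing of multiple video files
  • Recursive directory scanning
  • Custom output path
  • Output format options (JSON/YAML/TOML)
  • Parallel processing
  • Progress bar
  • Error-tolerant mode (skip files that fail)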
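
The optional batch and error-tolerant modes could fit together roughly as follows. This is a minimal sketch using only the standard library; `BatchReport`, `probe_all`, and the `probe_one` callback are hypothetical names, and the callback stands in for whatever per-file probe function the tool ends up with.

```rust
/// Outcome of a batch run: results for files that succeeded and
/// (path, error message) pairs for files that were skipped.
#[derive(Debug)]
pub struct BatchReport<T> {
    pub succeeded: Vec<(String, T)>,
    pub failed: Vec<(String, String)>,
}

/// Probe every path with `probe_one` (a hypothetical stand-in for the real
/// probe function). In tolerant mode a failure is recorded and the batch
/// continues; otherwise the first failure aborts the whole batch.
pub fn probe_all<T, F>(
    paths: &[&str],
    tolerant: bool,
    probe_one: F,
) -> Result<BatchReport<T>, String>
where
    F: Fn(&str) -> Result<T, String>,
{
    let mut report = BatchReport {
        succeeded: Vec::new(),
        failed: Vec::new(),
    };
    for &path in paths {
        match probe_one(path) {
            Ok(meta) => report.succeeded.push((path.to_string(), meta)),
            Err(e) if tolerant => report.failed.push((path.to_string(), e)),
            Err(e) => return Err(format!("{path}: {e}")),
        }
    }
    Ok(report)
}
```

Keeping the failures in the report, rather than only logging them, would let the CLI print a summary of skipped files at the end of a run.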

Tech Stack

Rust Dependencies

Core dependencies

[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"

Command-line tooling

clap = { version = "4.0", features = ["derive"] }

Optional enhancements

indicatif = "0.17"  # progress bar
rayon = "1.8"       # parallel processing
walkdir = "2.4"     # directory traversal

External Dependencies

  • ffprobe: FFmpeg must be installed on the system (same requirement as the Python version)
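
Because ffprobe is an external program, the tool may want to check for it up front and fail with a clear message. The sketch below assumes a hypothetical helper named `tool_available`; it relies only on the `-version` flag, which both ffprobe and ffmpeg accept.

```rust
use std::process::{Command, Stdio};

/// Returns true if `tool` can be started from PATH and exits successfully
/// when asked for its version. `tool_available` is a hypothetical helper,
/// not part of the original plan.
pub fn tool_available(tool: &str) -> bool {
    Command::new(tool)
        .arg("-version") // ffprobe and ffmpeg both accept `-version`
        .stdout(Stdio::null()) // discard the version banner
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false) // spawning failed, e.g. the binary is not installed
}
```

A call such as `tool_available("ffprobe")` at startup would let the CLI report a missing FFmpeg installation before trying to probe any file.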

Project Structure

video_probe/
├── Cargo.toml
├── Cargo.lock
├── README.md
├── LICENSE
├── .gitignore
├── src/
│   ├── main.rs           # CLI entry point
│   ├── lib.rs            # library interface
│   ├── probe.rs          # ffprobe execution logic
│   ├── parser.rs         # JSON parsing logic
│   ├── metadata.rs       # metadata struct definitions
│   ├── output.rs         # output formatting
│   └── error.rs          # error handling
├── tests/
│   ├── integration_test.rs
│   └── fixtures/
│       └── sample.mp4
└── docs/
    ├── USAGE.md
    └── DEVELOPMENT.md

Development Steps

Phase 1: Project Initialization (Day 1)

1.1 Create the Cargo project

cargo new video_probe
cd video_probe

1.2 Configure Cargo.toml

[package]
name = "video_probe"
version = "0.1.0"
edition = "2021"
authors = ["Your Name <your.email@example.com>"]
description = "Extract video metadata using ffprobe"
license = "MIT"

[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
chrono = { version = "0.4", features = ["serde"] }
anyhow = "1.0"
thiserror = "1.0"
clap = { version = "4.0", features = ["derive"] }

[dev-dependencies]
tempfile = "3.8"

1.3 Create the base file structure

mkdir -p src tests docs
touch src/{lib.rs,probe.rs,parser.rs,metadata.rs,output.rs,error.rs}

Phase 2: Core Data Structures (Day 1-2)

2.1 Define the metadata structures (src/metadata.rs)

use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};

#[derive(Debug, Serialize, Deserialize)]
pub struct VideoMetadata {
    pub video_path: String,
    pub probed_at: DateTime<Utc>,
    pub format: FormatInfo,
    pub video_stream: Option<VideoStream>,
    pub audio_streams: Vec<AudioStream>,
    pub subtitle_streams: Vec<SubtitleStream>,
    pub other_streams: Vec<OtherStream>,
}

// Default is derived because parser.rs calls FormatInfo::default()
#[derive(Debug, Default, Serialize, Deserialize)]
pub struct FormatInfo {
    pub filename: Option<String>,
    pub format_name: Option<String>,
    pub format_long_name: Option<String>,
    pub duration: f64,
    pub size: u64,
    pub bit_rate: u64,
    pub probe_score: Option<i32>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct VideoStream {
    pub index: i32,
    pub codec_name: Option<String>,
    pub codec_long_name: Option<String>,
    pub profile: Option<String>,
    pub level: Option<i32>,
    pub width: i32,
    pub height: i32,
    pub coded_width: Option<i32>,
    pub coded_height: Option<i32>,
    pub aspect_ratio: Option<String>,
    pub pix_fmt: Option<String>,
    pub field_order: Option<String>,
    pub r_frame_rate: Option<String>,
    pub avg_frame_rate: Option<String>,
    pub time_base: Option<String>,
    pub start_pts: Option<i64>,
    pub start_time: f64,
    pub duration: Option<f64>,
    pub bit_rate: Option<u64>,
    pub nb_frames: Option<u64>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct AudioStream {
    pub index: i32,
    pub codec_name: Option<String>,
    pub codec_long_name: Option<String>,
    pub profile: Option<String>,
    pub channels: i32,
    pub channel_layout: Option<String>,
    pub sample_rate: Option<String>,
    pub sample_fmt: Option<String>,
    pub bit_rate: Option<u64>,
    pub duration: Option<f64>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct SubtitleStream {
    pub index: i32,
    pub codec_name: Option<String>,
    pub language: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct OtherStream {
    pub index: i32,
    pub codec_type: String,
    pub codec_name: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tags: Option<serde_json::Value>,
}
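
The `r_frame_rate` and `avg_frame_rate` fields are kept as strings because ffprobe reports them as rationals such as "30000/1001". If a numeric value is needed later (for example in the console summary), a small helper along these lines could convert them. `parse_frame_rate` is a hypothetical name, not part of the plan above.

```rust
/// Convert an ffprobe rational string such as "30000/1001" into frames per
/// second. Returns None for malformed input or a zero denominator
/// (ffprobe reports "0/0" when the rate is unknown).
pub fn parse_frame_rate(rate: &str) -> Option<f64> {
    let (num, den) = rate.split_once('/')?;
    let num: f64 = num.trim().parse().ok()?;
    let den: f64 = den.trim().parse().ok()?;
    if den == 0.0 {
        return None;
    }
    Some(num / den)
}
```

Returning `Option<f64>` rather than defaulting to 0.0 keeps "unknown" distinguishable from a real frame rate.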

2.2 Define the error type (src/error.rs)

use thiserror::Error;

#[derive(Debug, Error)]
pub enum ProbeError {
    #[error("Video file not found: {0}")]
    FileNotFound(String),
    
    #[error("Failed to execute ffprobe: {0}")]
    FfprobeExecution(#[from] std::io::Error),
    
    #[error("Failed to parse ffprobe output: {0}")]
    ParseError(#[from] serde_json::Error),
    
    #[error("ffprobe returned non-zero exit code: {0}")]
    FfprobeFailed(String),
    
    #[error("No video stream found")]
    NoVideoStream,
}
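
To make the role of `#[error(...)]` and `#[from]` more concrete, here is a hand-written, standard-library-only approximation of what those attributes provide for two of the variants. `SketchError` and `file_size` are hypothetical names used only for illustration; the real code would keep using thiserror.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::Path;

/// Hand-written stand-in for two of the ProbeError variants.
#[derive(Debug)]
pub enum SketchError {
    FileNotFound(String),
    Io(io::Error),
}

// Roughly what `#[error("...")]` provides: a Display implementation.
impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::FileNotFound(p) => write!(f, "Video file not found: {p}"),
            SketchError::Io(e) => write!(f, "I/O error: {e}"),
        }
    }
}

impl std::error::Error for SketchError {}

// Roughly what `#[from]` provides: this impl lets `?` convert an io::Error.
impl From<io::Error> for SketchError {
    fn from(e: io::Error) -> Self {
        SketchError::Io(e)
    }
}

/// Hypothetical helper: return the size of a video file in bytes.
pub fn file_size(path: &str) -> Result<u64, SketchError> {
    if !Path::new(path).exists() {
        return Err(SketchError::FileNotFound(path.to_string()));
    }
    let meta = fs::metadata(path)?; // io::Error -> SketchError via From
    Ok(meta.len())
}
```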

Phase 3: ffprobe Execution Logic (Day 2-3)

3.1 Implement the ffprobe invocation (src/probe.rs)

use std::process::Command;
use anyhow::Result;
use crate::error::ProbeError;

pub fn run_ffprobe(video_path: &str) -> Result<String> {
    // Check that the file exists
    if !std::path::Path::new(video_path).exists() {
        return Err(ProbeError::FileNotFound(video_path.to_string()).into());
    }

    // Run ffprobe
    let output = Command::new("ffprobe")
        .args(&[
            "-v", "quiet",
            "-print_format", "json",
            "-show_format",
            "-show_streams",
            video_path
        ])
        .output()?;

    // Check the exit code
    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        return Err(ProbeError::FfprobeFailed(stderr.to_string()).into());
    }

    // Return the JSON output
    let stdout = String::from_utf8(output.stdout)?;
    Ok(stdout)
}
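
One detail worth weighing: `-v quiet` tells ffprobe to print nothing at all, so the stderr text captured for `FfprobeFailed` will typically be empty. The sketch below builds the same argument list with `-v error` instead, which keeps error messages while still suppressing the banner. Switching the log level is an assumption about the desired behavior, and `ffprobe_args` is a hypothetical helper name.

```rust
/// Build the ffprobe argument list. Uses `-v error` instead of `-v quiet`
/// so that failure messages still reach stderr (an assumption about the
/// desired behavior, not part of the original plan).
pub fn ffprobe_args(video_path: &str) -> Vec<String> {
    [
        "-v", "error",
        "-print_format", "json",
        "-show_format",
        "-show_streams",
        video_path,
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}
```

Factoring the arguments into a function also makes them easy to unit-test without invoking ffprobe.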

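3.2 Implement a parallel version (optional)

use anyhow::Result;
use rayon::prelude::*;
use crate::{metadata::VideoMetadata, parser, probe};

// Assumed helper (referenced but not defined in this plan): run ffprobe, then parse its output.
pub fn probe_video(video_path: &str) -> Result<VideoMetadata> {
    let json = probe::run_ffprobe(video_path)?;
    parser::parse_ffprobe_json(&json, video_path)
}

pub fn probe_videos_parallel(video_paths: &[&str]) -> Vec<Result<VideoMetadata>> {
    video_paths.par_iter()
        .map(|path| probe_video(path))
        .collect()
}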

Phase 4: JSON Parsing Logic (Day 3)

4.1 Implement JSON parsing (src/parser.rs)

use serde::Deserialize; // needed for #[derive(Deserialize)] below
use serde_json::Value;
use anyhow::Result;
use crate::metadata::*;

#[derive(Debug, Deserialize)]
struct FfprobeOutput {
    format: Option<Value>,
    streams: Option<Vec<Value>>,
}

pub fn parse_ffprobe_json(json_str: &str, video_path: &str) -> Result<VideoMetadata> {
    let ffprobe_data: FfprobeOutput = serde_json::from_str(json_str)?;
    
    let mut metadata = VideoMetadata {
        video_path: std::fs::canonicalize(video_path)?
            .to_string_lossy()
            .to_string(),
        probed_at: chrono::Utc::now(),
        format: FormatInfo::default(),
        video_stream: None,
        audio_streams: Vec::new(),
        subtitle_streams: Vec::new(),
        other_streams: Vec::new(),
    };
    
    // Parse the format section
    if let Some(fmt) = ffprobe_data.format {
        metadata.format = parse_format(&fmt)?;
    }
    
    // Parse the streams, dispatching on codec_type
    if let Some(streams) = ffprobe_data.streams {
        for stream in streams {
            let codec_type = stream.get("codec_type")
                .and_then(|v| v.as_str())
                .unwrap_or("");
            
            match codec_type {
                "video" => {
                    if metadata.video_stream.is_none() {
                        metadata.video_stream = Some(parse_video_stream(&stream)?);
                    }
                }
                "audio" => {
                    metadata.audio_streams.push(parse_audio_stream(&stream)?);
                }
                "subtitle" => {
                    metadata.subtitle_streams.push(parse_subtitle_stream(&stream)?);
                }
                _ => {
                    metadata.other_streams.push(parse_other_stream(&stream)?);
                }
            }
        }
    }
    
    Ok(metadata)
}

fn parse_format(fmt: &Value) -> Result<FormatInfo> {
    Ok(FormatInfo {
        filename: fmt.get("filename").and_then(|v| v.as_str()).map(String::from),
        format_name: fmt.get("format_name").and_then(|v| v.as_str()).map(String::from),
        format_long_name: fmt.get("format_long_name").and_then(|v| v.as_str()).map(String::from),
        duration: fmt.get("duration").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0.0),
        size: fmt.get("size").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
        bit_rate: fmt.get("bit_rate").and_then(|v| v.as_str()).and_then(|s| s.parse().ok()).unwrap_or(0),
        probe_score: fmt.get("probe_score").and_then(|v| v.as_i64()).map(|i| i as i32),
        tags: fmt.get("tags").cloned(),
    })
}

// Implement the other parse_* functions in the same way...
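
`parse_format` repeats the same "string field to number, else default" chain for `duration`, `size`, and `bit_rate`, because ffprobe emits these values as JSON strings. The remaining `parse_*` functions will likely need the same pattern, so it could be factored into a generic helper. This is a sketch under that assumption; `parse_or_default` is a hypothetical name.

```rust
use std::str::FromStr;

/// Parse an optional string field into a number, falling back to the type's
/// default (0 or 0.0) when the field is missing or malformed.
pub fn parse_or_default<T: FromStr + Default>(field: Option<&str>) -> T {
    field
        .and_then(|s| s.trim().parse().ok())
        .unwrap_or_default()
}
```

With this helper, a field could be read as `parse_or_default(fmt.get("duration").and_then(|v| v.as_str()))`, with the target type inferred from the struct field.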

Phase 5: Output and Formatting (Day 3-4)

5.1 Implement the output logic (src/output.rs)

use std::path::Path;
use anyhow::Result;
use crate::metadata::VideoMetadata;

pub fn save_metadata(video_path: &str, metadata: &VideoMetadata) -> Result<String> {
    let video_path = Path::new(video_path);
    let video_dir = video_path.parent().unwrap_or(Path::new("."));
    let video_name = video_path.file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("unknown");
    
    let output_file = video_dir.join(format!("{}.probe.json", video_name));
    
    let json = serde_json::to_string_pretty(metadata)?;
    std::fs::write(&output_file, json)?;
    
    Ok(output_file.to_string_lossy().to_string())
}

pub fn print_summary(metadata: &VideoMetadata) {
    println!("✓ Video probed successfully!\n");
    
    if let Some(ref filename) = metadata.format.filename {
        println!("File: {}", filename);
    }
    
    if let Some(ref format_name) = metadata.format.format_long_name {
        println!("Format: {}", format_name);
    }
    
    println!("Duration: {:.2} seconds", metadata.format.duration);
    println!("Size: {:.2} MB", metadata.format.size as f64 / 1024.0 / 1024.0);
    println!("Bit rate: {:.0} kbps", metadata.format.bit_rate as f64 / 1000.0);
    
    if let Some(ref vs) = metadata.video_stream {
        println!("\nVideo Stream:");
        println!("  Codec: {} ({:?})", 
                 vs.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
                 vs.profile);
        println!("  Resolution: {}x{}", vs.width, vs.height);
        println!("  Frame rate: {}", vs.r_frame_rate.as_ref().unwrap_or(&"N/A".to_string()));
        println!("  Pixel format: {}", vs.pix_fmt.as_ref().unwrap_or(&"N/A".to_string()));
    }
    
    if !metadata.audio_streams.is_empty() {
        println!("\nAudio Streams: {}", metadata.audio_streams.len());
        for (i, audio) in metadata.audio_streams.iter().enumerate() {
            println!("  [{}] {} - {} channels @ {} Hz",
                     i + 1,
                     audio.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
                     audio.channels,
                     audio.sample_rate.as_ref().unwrap_or(&"N/A".to_string()));
        }
    }
    
    if !metadata.subtitle_streams.is_empty() {
        println!("\nSubtitle Streams: {}", metadata.subtitle_streams.len());
        for (i, sub) in metadata.subtitle_streams.iter().enumerate() {
            println!("  [{}] {} ({:?})",
                     i + 1,
                     sub.codec_name.as_ref().unwrap_or(&"N/A".to_string()),
                     sub.language);
        }
    }
}
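
The summary prints the duration in raw seconds and the size in megabytes. If a more readable form is wanted, helpers like the following could be used. Both function names are hypothetical, and the choice of binary units (MiB) is an assumption that matches the division by 1024 used in `print_summary`.

```rust
/// Format a duration in seconds as HH:MM:SS (fractional seconds truncated).
/// Negative or non-finite input is treated as zero.
pub fn format_duration(seconds: f64) -> String {
    let total = if seconds.is_finite() && seconds > 0.0 {
        seconds as u64
    } else {
        0
    };
    format!(
        "{:02}:{:02}:{:02}",
        total / 3600,
        (total % 3600) / 60,
        total % 60
    )
}

/// Format a byte count using binary units, e.g. 52428800 -> "50.00 MiB".
pub fn format_size(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Step up one unit at a time until the value drops below 1024.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    format!("{:.2} {}", value, UNITS[unit])
}
```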

Phase 6: Command-Line Interface (Day 4)

6.1 Implement the main program (src/main.rs)

use clap::Parser;
use anyhow::Result;

mod probe;
mod parser;
mod metadata;
mod output;
mod error;

#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
    /// Video file path
    video_path: String,
    
    /// Output directory (default: same as video file)
    #[arg(short, long)]
    output: Option<String>,
    
    /// Verbose output
    #[arg(short, long)]
    verbose: bool,
}

fn main() -> Result<()> {
    let args = Args::parse();

    println!("Probing video: {}", args.video_path);
    println!("{}", "=".repeat(60));

    // Run ffprobe
    let json_output = probe::run_ffprobe(&args.video_path)?;

    // Parse the JSON
    let metadata = parser::parse_ffprobe_json(&json_output, &args.video_path)?;

    // Save to file
    let output_file = output::save_metadata(&args.video_path, &metadata)?;

    // Print the summary
    output::print_summary(&metadata);

    println!("\n✓ Metadata saved to: {}", output_file);
    println!("{}", "=".repeat(60));

    Ok(())
}
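
Note that `Args` declares an `output` directory option, but `main` never passes it on and `save_metadata` always writes next to the video. One way to honor the option is to compute the destination path separately. The following is a sketch only; `output_path_for` is a hypothetical function, and wiring it into `save_metadata` is left open.

```rust
use std::path::{Path, PathBuf};

/// Compute where the probe JSON should be written. With `output_dir` set,
/// the file goes there; otherwise it sits next to the video file.
pub fn output_path_for(video_path: &Path, output_dir: Option<&Path>) -> PathBuf {
    // Same stem logic as save_metadata: "clip.mp4" -> "clip"
    let stem = video_path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("unknown");
    let dir = output_dir
        .or_else(|| video_path.parent())
        .unwrap_or_else(|| Path::new("."));
    dir.join(format!("{stem}.probe.json"))
}
```

In `main`, this could be called as `output_path_for(Path::new(&args.video_path), args.output.as_deref().map(Path::new))`.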

Phase 7: Testing (Day 5)

7.1 Unit tests

#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_parse_format() {
        let json = r#"{
            "filename": "test.mp4",
            "format_name": "mov,mp4",
            "duration": "120.5",
            "size": "52428800",
            "bit_rate": "3473408"
        }"#;
        
        let value: serde_json::Value = serde_json::from_str(json).unwrap();
        let format = parse_format(&value).unwrap();
        
        assert_eq!(format.filename, Some("test.mp4".to_string()));
        assert_eq!(format.duration, 120.5);
    }
}

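7.2 Integration tests

// tests/integration_test.rs
// Assumes lib.rs exports probe_video (it is referenced but not defined in this plan).
use video_probe::probe_video;

#[test]
fn test_probe_video() {
    let video_path = "tests/fixtures/sample.mp4";
    let result = probe_video(video_path);
    assert!(result.is_ok());
}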

Phase 8: Documentation and Release (Day 5-6)

8.1 Write README.md

# video_probe

Extract video metadata using ffprobe (Rust version)

## Installation

cargo install video_probe

## Usage

video_probe video.mp4

## Features

  • Fast and efficient (written in Rust)
  • Cross-platform (Linux, macOS, Windows)
  • Comprehensive metadata extraction
  • JSON output format
  • User-friendly console output

8.2 Publish to crates.io

cargo publish

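Development Schedule

| Phase | Task | Estimated time |
|-------|------|----------------|
| 1 | Project initialization | 0.5 days |
| 2 | Data structure definitions | 1 day |
| 3 | ffprobe execution logic | 1 day |
| 4 | JSON parsing logic | 1 day |
| 5 | Output and formatting | 0.5 days |
| 6 | Command-line interface | 0.5 days |
| 7 | Testing | 1 day |
| 8 | Documentation and release | 0.5 days |
| Total | | 6 days |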

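Comparison with the Python Version

| Feature | Python version | Rust version | Advantage |
|---------|----------------|--------------|-----------|
| Performance | Moderate | | 2-10x faster |
| Memory usage | Relatively high | | More efficient |
| Startup time | | | Instant startup |
| Deployment | Requires Python | Single binary | Simpler |
| Cross-platform | | | Same |
| Dependency management | pip | Cargo | Cargo is better |
| Type safety | | | Compile-time checks |
| Concurrency support | Limited | Excellent | Rayon parallelism |
| Error handling | Exceptions | Result | More explicit |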

Next Steps

  1. Create the Gitea repository video_probe
  2. Initialize the Cargo project
  3. Implement the core features
  4. Add tests
  5. Write the documentation
  6. Publish to crates.io (optional)

References