Initial commit: Momentry Core v0.1
- Rust-based digital asset management system
- Video analysis: ASR, OCR, YOLO, Face, Pose
- RAG capabilities with Qdrant vector database
- Multi-database support: PostgreSQL, Redis, MongoDB
- Monitoring system with launchd plists
- n8n workflow automation integration
366
monitor/MONITORING.md
Normal file
@@ -0,0 +1,366 @@
# Momentry Monitoring System

## Overview

The Momentry monitoring system uses a seven-layer architecture that covers every monitoring need, from external services down to local storage.

---

## Monitoring Architecture (Seven Layers)

```
Layer 1: External monitoring - DDNS, gateway, internet connectivity
Layer 2: Service monitoring  - 15 momentry services
Layer 3: Workflow monitoring - n8n workflow status
Layer 4: Portal monitoring   - WordPress pages and accounts
Layer 5: Database monitoring - PostgreSQL/Redis/Qdrant/MariaDB
Layer 6: User monitoring     - sessions, local users, anomaly detection
Layer 7: Storage monitoring  - hot/warm/cold tiering, archiving, file registry
```

---

## Quick Start

### View monitoring status

```bash
cd /Users/accusys/momentry_core_0.1/monitor
./control/monitor_control.sh status
```

### Run a full check

```bash
./control/monitor_control.sh check all
```

### Check a specific layer

```bash
./control/monitor_control.sh check service   # Layer 2
./control/monitor_control.sh check workflow  # Layer 3
./control/monitor_control.sh check portal    # Layer 4
./control/monitor_control.sh check database  # Layer 5
./control/monitor_control.sh check users     # Layer 6
./control/monitor_control.sh check storage   # Layer 7
```

---

## Layer Details

### Layer 1: External Monitoring

Monitors the availability of external dependencies.

**Monitored items**:
- DDNS name resolution (momentry.ddns.net)
- Gateway reachability
- External API connectivity

**Script**: `service/external_monitor.sh`

### Layer 2: Service Monitoring

Monitors the running state of the 15 momentry services.

**Monitored items**:

| Service | Port | Status check |
|---------|------|--------------|
| PostgreSQL | 5432 | pg_isready |
| Redis | 6379 | redis-cli ping |
| MariaDB | 3306 | mysql ping |
| n8n | 5678 | HTTP GET |
| Caddy | 443 | HTTPS |
| Gitea | 3000 | HTTP |
| SFTPGo | 2222 | SSH |
| Ollama | 11434 | API |
| Qdrant | 6333 | API |
| PHP-FPM | - | Process |
| RustDesk | 21116 | Port |
| MongoDB | 27017 | Mongo ping |

**Script**: `service/health_check.sh`
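
The checks in the table above all follow the same pattern: run a command and treat a zero exit status as healthy. The following is a minimal sketch of that pattern. The function name `run_check` and the `UP`/`DOWN` output are hypothetical and are not taken from `service/health_check.sh`.

```shell
#!/bin/bash
# Hypothetical helper: run one health-check command and report the result.
# Usage: run_check <service-name> <command> [args...]
run_check() {
    local name="$1"
    shift
    # Discard the command's own output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
        echo "$name UP"
    else
        echo "$name DOWN"
        return 1
    fi
}

# Example calls, using check commands from the table above:
#   run_check postgresql pg_isready -h localhost -p 5432
#   run_check redis redis-cli ping
```

Because the helper returns a non-zero status on failure, a caller can count failed services or stop at the first one.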

### Layer 3: n8n Workflow Monitoring

Monitors n8n workflow execution status and analyzes idle workflows.

**Monitored items**:
- Workflow count and status
- Execution counts and results
- Idle workflow identification
- Improvement suggestions

**Idle definition**: no schedule AND no API trigger AND not executed for more than 30 days
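
The idle definition is a conjunction of three conditions, which can be sketched as a small predicate. The function name `is_idle_workflow` and its argument order are assumptions made for illustration; how `workflow/idle_analyzer.sh` obtains these values from n8n is not shown in this document.

```shell
#!/bin/bash
# Hypothetical predicate for the idle rule:
# no schedule AND no API trigger AND last run more than <threshold> days ago.
# Arguments: has_schedule (0|1), has_api_trigger (0|1), days_since_last_run,
#            optional threshold in days (defaults to 30).
is_idle_workflow() {
    local has_schedule="$1" has_api_trigger="$2" days_since_run="$3"
    local threshold="${4:-30}"
    [ "$has_schedule" -eq 0 ] &&
        [ "$has_api_trigger" -eq 0 ] &&
        [ "$days_since_run" -gt "$threshold" ]
}

# Example: is_idle_workflow 0 0 45 && echo "idle"
```

The threshold is a parameter so that it can follow `idle_threshold_days` from the configuration file.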

**Scripts**:
- `workflow/n8n_workflow_monitor.sh`
- `workflow/idle_analyzer.sh`

### Layer 4: WordPress Portal Monitoring

Monitors WordPress page accessibility and internal accounts.

**Monitored items**:
- Home page and login page accessibility
- Response time
- User list and roles
- Added and deleted users

**Scripts**:
- `portal/page_monitor.sh`
- `portal/account_monitor.sh`

### Layer 5: Database Monitoring

Monitors the health and performance metrics of every database.

**PostgreSQL**:
- Table count, row counts, sizes
- Dead tuples, slow queries
- Schema changes

**Redis**:
- Connection count, memory usage
- Hit rate, operation counts
- Client list

**Qdrant**:
- Collection list
- Point counts, vector dimensions
- Disk usage

**MariaDB**:
- Connection count, buffer pool
- WordPress table schema

**Scripts**:
- `database/postgres_monitor.sh`
- `database/redis_monitor.sh`
- `database/qdrant_monitor.sh`
- `database/mariadb_monitor.sh`

### Layer 6: User Monitoring

Monitors the activity of remote (connected) users and local users.

**Remote users**:
- SSH logins and commands
- Web service logins (n8n, Gitea, WP)
- Database connections
- SFTP transfers

**Local users**:
- System logins
- sudo usage records
- Service account activity
- Anomaly detection

**Scripts**:
- `users/session_tracker.sh`
- `users/login_monitor.sh`
- `users/sudo_tracker.sh`
- `users/anomaly_detector.sh`

### Layer 7: Storage Architecture

Manages hot/warm/cold data tiering and the archiving policy.

**Directory layout**:
```
/Users/accusys/momentry/
├── var/                 # service data (hot)
├── etc/                 # configuration (warm)
├── log/                 # logs (warm)
├── data/                # user data
│   ├── family/          # family cluster
│   ├── work/            # work cluster
│   ├── wordpress/       # WordPress (isolated)
│   └── shared/          # shared
├── backup/              # backups
│   ├── daily/           # daily backups (kept 30 days)
│   ├── weekly/          # weekly backups (kept 12 weeks)
│   ├── monthly/         # monthly backups (kept 12 months)
│   └── archive/         # archive (kept 12+ months)
└── tmp/                 # temporary files
```

**Tiering criteria**:

| Tier | Criterion | Location |
|------|-----------|----------|
| Hot | More than 10 accesses within 7 days | NVMe |
| Warm | More than 1 access within 30 days | RAID |
| Cold | No access for 90 days | Object Storage |
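
The tiering criteria can be read as a cascade that is evaluated from hot to cold. The sketch below follows that reading. The function name `classify_tier`, its three arguments, and the `unclassified` fallback for files that match none of the rows are assumptions; the table does not say how such files are handled.

```shell
#!/bin/bash
# Hypothetical classifier for the tiering table.
# Arguments: accesses in the last 7 days, accesses in the last 30 days,
#            days since the last access.
classify_tier() {
    local accesses_7d="$1" accesses_30d="$2" days_since_access="$3"
    if [ "$accesses_7d" -gt 10 ]; then
        echo "hot"            # more than 10 accesses within 7 days
    elif [ "$accesses_30d" -gt 1 ]; then
        echo "warm"           # more than 1 access within 30 days
    elif [ "$days_since_access" -ge 90 ]; then
        echo "cold"           # no access for 90 days
    else
        echo "unclassified"   # assumption: the table leaves this case open
    fi
}

# Example: classify_tier 0 3 5
```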

**Backup tiering**:

| Tier | Retention | Purpose |
|------|-----------|---------|
| daily | 7 days | Fast recovery |
| weekly | 30 days | Standard recovery |
| monthly | 365 days | Long-term archive |
| archive | >365 days | Regulatory compliance |
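
Read as age brackets, the retention table maps a backup's age to the tier it belongs in. The sketch below shows that mapping. The function name `backup_tier_for_age` is hypothetical, and treating each boundary as inclusive (a 7-day-old backup is still `daily`) is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: map a backup's age in days to a tier from the table.
backup_tier_for_age() {
    local age_days="$1"
    if [ "$age_days" -le 7 ]; then
        echo "daily"
    elif [ "$age_days" -le 30 ]; then
        echo "weekly"
    elif [ "$age_days" -le 365 ]; then
        echo "monthly"
    else
        echo "archive"
    fi
}

# Example: backup_tier_for_age 45
```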

**Scripts**:
- `storage/storage_manager.sh` - storage management
- `storage/backup_monitor.sh` - backup monitoring and warm/cold tiering
- `storage/migration_engine.sh` - data migration
- `storage/file_registry.py` - file registry

---

## Configuration

### Main configuration

Edit `config/monitor_config.yaml`:

```yaml
monitoring:
  enabled: true
  check_interval: 300  # seconds

service:
  enabled: true
  services:
    - postgresql
    - redis
    - mariadb
    - n8n
    - caddy
    - gitea
    - sftpgo
    - ollama
    - qdrant
    - php
    - rustdesk

workflow:
  enabled: true
  idle_threshold_days: 30

portal:
  enabled: true

database:
  enabled: true

users:
  enabled: true

storage:
  enabled: false  # implemented separately
```

---

## Database Tables

Monitoring data is stored in the PostgreSQL `momentry` database.

**Main tables**:
- `monitor_services` - service health status
- `monitor_workflows` - n8n workflow monitoring
- `monitor_databases` - database metrics
- `monitor_sessions` - user sessions
- `monitor_logins` - login history
- `monitor_anomalies` - detected anomalies
- `file_registry` - file registry

**Table creation**: `database/schema.sql`

---

## Alert Rules

| Layer | Anomaly | Severity | Action |
|-------|---------|----------|--------|
| Service | Service down | Critical | Log |
| Service | Slow response | Warning | Log |
| Workflow | Idle > 30 days | Info | Log |
| Workflow | Repeated failures | Critical | Log |
| Portal | Page unreachable | Critical | Log |
| Database | Schema change | Critical | Log |
| Database | Connection overload | Warning | Log |
| Users | Brute-force attempt | Critical | Log |
| Users | Unusual login | Warning | Log |

**Anomaly handling**: anomalies are only recorded in the database, for later analysis.
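
The rule table amounts to a lookup from a layer and an anomaly type to a severity. The sketch below expresses it as a `case` statement. The function name `alert_level` and the lowercase identifiers such as `slow_response` are hypothetical labels chosen for illustration; they are not names used by the monitoring scripts.

```shell
#!/bin/bash
# Hypothetical lookup: severity for a (layer, anomaly) pair from the table.
alert_level() {
    local layer="$1" anomaly="$2"
    case "${layer}:${anomaly}" in
        service:down|workflow:repeated_failure|portal:unreachable|database:schema_change|users:brute_force)
            echo "Critical" ;;
        service:slow_response|database:connection_overload|users:unusual_login)
            echo "Warning" ;;
        workflow:idle)
            echo "Info" ;;
        *)
            # Pairs that are not in the table are reported as unknown.
            echo "Unknown"
            return 1 ;;
    esac
}

# Example: alert_level database schema_change
```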

---

## Maintenance

### Run monitoring manually

```bash
# One-off check
./control/monitor_control.sh check service

# Continuous monitoring (every 5 minutes)
./control/monitor_control.sh monitor
```

### View history

```bash
# Latest service status records
psql -U accusys -h localhost -d momentry -c "SELECT * FROM monitor_services ORDER BY checked_at DESC LIMIT 10;"

# Anomalies from the last 24 hours
psql -U accusys -h localhost -d momentry -c "SELECT * FROM monitor_anomalies WHERE detected_at > NOW() - INTERVAL '24 hours';"
```

### Clean up historical data

```bash
# Keep the last 30 days
psql -U accusys -h localhost -d momentry -c "DELETE FROM monitor_services WHERE checked_at < NOW() - INTERVAL '30 days';"
```
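
The same cleanup statement applies to other history tables with a different table name, timestamp column and retention period. The sketch below builds that statement. The function name `build_cleanup_sql` is hypothetical, and the input validation is an addition, included because the values are interpolated directly into SQL.

```shell
#!/bin/bash
# Hypothetical helper: print a DELETE statement that removes rows older than
# <days> days from <table>, based on the timestamp in <column>.
build_cleanup_sql() {
    local table="$1" column="$2" days="$3"
    # Accept only plain identifiers and a non-negative integer.
    [[ "$table" =~ ^[a-z_]+$ ]] || return 1
    [[ "$column" =~ ^[a-z_]+$ ]] || return 1
    [[ "$days" =~ ^[0-9]+$ ]] || return 1
    echo "DELETE FROM ${table} WHERE ${column} < NOW() - INTERVAL '${days} days';"
}

# Example, using the detected_at column from the anomaly query above:
#   psql -U accusys -h localhost -d momentry \
#        -c "$(build_cleanup_sql monitor_anomalies detected_at 90)"
```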

---

## File Structure

```
monitor/
├── MONITORING.md                 # this document
├── config/
│   └── monitor_config.yaml       # configuration file
├── control/
│   └── monitor_control.sh        # control script
├── service/
│   ├── health_check.sh           # service health check
│   └── external_monitor.sh       # external monitoring
├── workflow/
│   ├── n8n_workflow_monitor.sh
│   └── idle_analyzer.sh
├── portal/
│   ├── page_monitor.sh
│   └── account_monitor.sh
├── database/
│   ├── schema.sql                # database tables
│   ├── postgres_monitor.sh
│   ├── redis_monitor.sh
│   ├── qdrant_monitor.sh
│   └── mariadb_monitor.sh
├── users/
│   ├── session_tracker.sh
│   ├── login_monitor.sh
│   ├── sudo_tracker.sh
│   └── anomaly_detector.sh
├── storage/
│   ├── storage_manager.sh
│   └── migration_engine.sh
└── docs/
    └── TROUBLESHOOTING.md
```

---

## Related Documents

- [Storage architecture design specification](./storage/STORAGE_SPEC.md)
- [WordPress monitoring](./wordpress/MONITORING.md)
- [Anomaly detection rules](./users/ANOMALY_RULES.md)
388
monitor/SKILL_TROUBLESHOOTING.md
Normal file
@@ -0,0 +1,388 @@
# Momentry Service Troubleshooting Skill

## Overview

This skill is a quick diagnostic reference for common service problems. It covers problem types, check commands, log locations and common fixes.

---

## Quick Checklist

### 1. Service status

```bash
# List the status of all momentry services
launchctl list | grep com.momentry

# Or run the health check
/Users/accusys/momentry_core_0.1/monitor/service/health_check.sh
```

### 2. Port usage

```bash
# Check a specific port
lsof -i :<PORT>

# Common ports
# PostgreSQL: 5432
# Redis: 6379
# MariaDB: 3306
# n8n: 8085 (internal), 5678 (old configuration)
# Caddy Admin: 2019
# Ollama: 11434
# Qdrant: 6333
# SFTPGo: 8080 (HTTP), 2022 (SFTP)
# Gitea: 3000
# PHP-FPM: 9000
# RustDesk: 21115-21119
# MongoDB: 27017
```

---

## Detailed Service Diagnostics

### PostgreSQL

**Config files**: `/Users/accusys/momentry/etc/postgresql/`
**Data directory**: `/Users/accusys/momentry/var/postgresql/`
**Log**: `/Users/accusys/momentry/log/postgresql.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Connection failure | `pg_isready -h localhost -p 5432 -U accusys` | Check that the plist is loaded |
| Authentication error | `psql -U accusys -h localhost -d momentry -c "SELECT 1"` | Check pg_hba.conf |
| Performance problem | `psql -U accusys -h localhost -d momentry -c "SELECT * FROM pg_stat_activity"` | Check the connection count |
| Database does not exist | `psql -U accusys -h localhost -l` | Create the database |

```bash
# Create the database
createdb -U accusys momentry
```

---

### Redis

**Config file**: `/opt/homebrew/etc/redis.conf`
**Data directory**: `/Users/accusys/momentry/var/redis/`
**Log**: `/Users/accusys/momentry/log/redis.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Connection failure | `redis-cli -a accusys ping` | Check that the password is correct |
| Authentication error | `redis-cli -a accusys AUTH accusys` | Check the password configuration |
| High memory usage | `redis-cli -a accusys INFO memory` | Check the number of keys |
| Persistence failure | `redis-cli -a accusys LASTSAVE` | Check the RDB configuration |

```bash
# Common operations
redis-cli -a accusys SAVE       # trigger a save
redis-cli -a accusys FLUSHALL   # delete ALL keys (destructive)
redis-cli -a accusys KEYS '*'   # list all keys
```

---

### MariaDB

**Config files**: `/Users/accusys/momentry/etc/mariadb/`
**Data directory**: `/Users/accusys/momentry/var/mariadb/`
**Log**: `/Users/accusys/momentry/log/mariadb.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Connection failure | `mysql -u accusys -e "SELECT 1"` | Check user privileges |
| Access denied | `mysql -u root -e "SELECT user FROM mysql.user"` | Check the user configuration |
| Performance problem | `mysql -u accusys -e "SHOW PROCESSLIST"` | Check for slow queries |

```bash
# Common operations
mysql -u accusys -e "SHOW DATABASES"
mysql -u accusys -e "SHOW TABLES" momentry
```

---

### n8n

**Config files**: `/Users/accusys/momentry/etc/n8n/`
**Data directory**: `/Users/accusys/momentry/var/n8n/`
**Log**: `/Users/accusys/momentry/log/n8n-main.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Web page unreachable | `curl -s http://localhost:8085/` | Check that the port is correct |
| API error | `curl -s http://localhost:8085/healthz` | Check the service status |
| Workflows not executing | Check the n8n log | Check the queue (Redis) connection |
| Database connection failure | `psql -U n8n -h localhost -d n8n -c "SELECT 1"` | Check PostgreSQL |

**Important**: n8n uses PostgreSQL (not SQLite) and listens on port 8085.

```bash
# Connect to the database
psql -U n8n -h localhost -d n8n

# List users
psql -U n8n -h localhost -d n8n -c "SELECT id, email, \"firstName\", \"roleSlug\" FROM \"user\";"

# List workflows
psql -U n8n -h localhost -d n8n -c "SELECT id, name, active FROM workflow_entity;"
```

---

### Ollama

**Config files**: `/Users/accusys/momentry/etc/ollama/`
**Model directory**: `/Users/accusys/momentry/var/ollama/`
**Log**: `/Users/accusys/momentry/log/ollama.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| API not responding | `curl -s http://localhost:11434/api/tags` | Check that the service is running |
| Model download fails | `ollama list` | Download the model again |
| Out of memory | `ollama ps` | Check which models are loaded |

```bash
# Common operations
ollama list                   # list downloaded models
ollama pull <model>           # download a model
ollama run <model> <prompt>   # run a model
```

---

### Qdrant

**Config files**: `/Users/accusys/momentry/etc/qdrant/`
**Data directory**: `/Users/accusys/momentry/var/qdrant/`
**Log**: `/Users/accusys/momentry/log/qdrant.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| API not responding | `curl -s http://localhost:6333/collections` | An API key is required |
| Authentication failure | `curl -s -H "api-key: Test3200Test3200" http://localhost:6333/collections` | Check the API key |
| Performance problem | `curl -s http://localhost:6333/cluster` | Check the cluster status |

```bash
# Commands that require authentication
curl -s -H "api-key: Test3200Test3200" http://localhost:6333/collections
# Point counts are per collection; replace <collection> with a collection name
curl -s -X POST -H "api-key: Test3200Test3200" -H "Content-Type: application/json" \
     -d '{"exact": true}' http://localhost:6333/collections/<collection>/points/count
```

---

### Caddy

**Config files**: `/Users/accusys/momentry/etc/caddy/`
**Data directory**: `/Users/accusys/momentry/var/caddy/`
**Log**: `/Users/accusys/momentry/log/caddy.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Admin API not responding | `curl -s http://localhost:2019/config/` | Check the Caddy process |
| Site unreachable | `curl -s -I https://localhost:443/` | Check the certificate configuration |
| Proxy failure | `curl -s http://localhost:2019/config/` | Check the reverse proxy configuration |

```bash
# Reload the configuration by restarting the service
sudo launchctl unload /Library/LaunchDaemons/com.momentry.caddy.plist
sudo launchctl load /Library/LaunchDaemons/com.momentry.caddy.plist
```

---

### Gitea

**Config file**: `/Users/accusys/momentry/etc/gitea/app.ini`
**Data directory**: `/Users/accusys/momentry/var/gitea/`
**Log**: `/Users/accusys/momentry/log/gitea.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Web page unreachable | `curl -s http://localhost:3000/` | Check the service status |
| Git push fails | `ssh git@localhost -p 22` | Check the SSH configuration |
| Database connection | `mysql -u gitea -p gitea_db -e "SELECT 1"` | Check PostgreSQL |

---

### SFTPGo

**Config file**: `/Users/accusys/momentry/etc/sftpgo/sftpgo.json`
**Data directory**: `/Users/accusys/momentry/var/sftpgo/`
**Log**: `/Users/accusys/momentry/log/sftpgo.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Web UI unreachable | `curl -s http://localhost:8080/` | Check the HTTP port |
| SFTP connection failure | `sftp -P 2022 user@localhost` | Check the SSH keys |
| Authentication failure | `curl -s http://localhost:8080/api/v2/info` | Check the user configuration |

---

### PHP-FPM

**Config files**: `/Users/accusys/momentry/etc/php/`
**Log**: `/Users/accusys/momentry/log/php.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| PHP fails to run | `php -v` | Check the PHP version |
| FPM not responding | `nc -z localhost 9000` | Check the FPM port (FPM speaks FastCGI, not HTTP) |
| 500 errors | `tail -100 /Users/accusys/momentry/log/php.error.log` | Check the error log |

---

### RustDesk

**Config files**: `/Users/accusys/momentry/etc/rustdesk/`
**Logs**: `/Users/accusys/momentry/log/rustdesk.*.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Connection failure | `nc -z localhost 21116 && nc -z localhost 21117` | Check the ports |
| ID server error | `pgrep -l hbbs` | Check the hbbs process |
| Relay server error | `pgrep -l hbbr` | Check the hbbr process |

---

### MongoDB

**Config files**: `/Users/accusys/momentry/etc/mongodb/`
**Data directory**: `/Users/accusys/momentry/var/mongodb/`
**Log**: `/Users/accusys/momentry/log/mongodb.log`

| Problem | Check command | Resolution |
|---------|---------------|------------|
| Connection failure | `mongosh --quiet --eval "db.adminCommand('ping')"` | Authentication is required |
| Authentication error | `mongosh -u accusys -p Test3200Test3200 --authenticationDatabase admin` | Check the user |
| Performance problem | `mongosh -u accusys -p Test3200Test3200 --eval "db.serverStatus()"` | Check the server status |

---

## Service Management Commands

### Start a service

```bash
sudo launchctl load /Library/LaunchDaemons/com.momentry.<service>.plist
```

### Stop a service

```bash
sudo launchctl unload /Library/LaunchDaemons/com.momentry.<service>.plist
```

### Restart a service

```bash
sudo launchctl unload /Library/LaunchDaemons/com.momentry.<service>.plist
sudo launchctl load /Library/LaunchDaemons/com.momentry.<service>.plist
```
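
The restart sequence can be wrapped in a small helper so that the plist path is built in one place. This is a sketch under assumptions: the function name `restart_service` and the `DRY_RUN` variable are hypothetical, and the plist naming scheme is the one shown in the commands above.

```shell
#!/bin/bash
# Hypothetical helper: restart a momentry launchd service by name.
# With DRY_RUN=1 the commands are printed instead of executed.
restart_service() {
    local service="$1"
    if [ -z "$service" ]; then
        echo "usage: restart_service <service>" >&2
        return 2
    fi
    local plist="/Library/LaunchDaemons/com.momentry.${service}.plist"
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    # $run is deliberately unquoted: when empty it expands to nothing.
    $run sudo launchctl unload "$plist"
    $run sudo launchctl load "$plist"
}

# Example: DRY_RUN=1 restart_service caddy
```

Running it once with `DRY_RUN=1` shows exactly which plist would be touched before anything is unloaded.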

### View service logs

```bash
tail -f /Users/accusys/momentry/log/<service>.log
tail -f /Users/accusys/momentry/log/<service>.error.log
```

---

## Service Port Reference

| Service | Internal port | External port | Caddy domain |
|---------|---------------|---------------|--------------|
| PostgreSQL | 5432 | 5432 | - |
| Redis | 6379 | 6379 | - |
| MariaDB | 3306 | 3306 | - |
| n8n | 8085 | 443 | n8n.momentry.ddns.net |
| Ollama | 11434 | 11434 | - |
| Qdrant | 6333 | 443 | qdrant.momentry.ddns.net |
| Caddy Admin | 2019 | 2019 | - |
| Gitea | 3000 | 443 | gitea.momentry.ddns.net |
| SFTPGo | 8080/2022 | 443 | sftpgo.momentry.ddns.net |
| PHP-FPM | 9000 | - | - |
| MongoDB | 27017 | 27017 | - |
| RustDesk | 21115-21119 | 21115-21119 | - |

---

## Quick Fixes for Common Problems

### 1. A service will not start

```bash
# 1. Check the process
pgrep -l <service_name>

# 2. Check the port
lsof -i :<PORT>

# 3. Check the log
tail -100 /Users/accusys/momentry/log/<service>.error.log

# 4. Reload the service
sudo launchctl unload /Library/LaunchDaemons/com.momentry.<service>.plist
sudo launchctl load /Library/LaunchDaemons/com.momentry.<service>.plist
```

### 2. Authentication failure

```bash
# Check that the user exists
psql -U <db_user> -h localhost -d <db_name> -c "SELECT 1"

# Check the password
# See the INSTALL_*.md document of each service
```

### 3. Performance problems

```bash
# Check system resources
top -o cpu
top -o mem

# Check disk space
df -h

# Check network connections
netstat -an | grep ESTABLISHED
```

---

## Monitoring Script Locations

- Health check: `/Users/accusys/momentry_core_0.1/monitor/service/health_check.sh`
- PostgreSQL: `/Users/accusys/momentry_core_0.1/monitor/database/postgres_monitor.sh`
- Redis: `/Users/accusys/momentry_core_0.1/monitor/database/redis_monitor.sh`
- Qdrant: `/Users/accusys/momentry_core_0.1/monitor/database/qdrant_monitor.sh`
- MongoDB: `/Users/accusys/momentry_core_0.1/monitor/database/mongodb_monitor.sh`
- n8n Workflow: `/Users/accusys/momentry_core_0.1/monitor/workflow/n8n_workflow_monitor.sh`

---

## Password Reference (replace with the real passwords)

| Service | User | Password |
|---------|------|----------|
| PostgreSQL | accusys | (no password) |
| PostgreSQL | n8n | accusys |
| Redis | - | accusys |
| MariaDB | accusys | - |
| n8n | - | (configured in the Web UI) |
| Qdrant | - | Test3200Test3200 |
| MongoDB | accusys | Test3200Test3200 |

---

## Document Locations

- Installation documents: `/Users/accusys/momentry_core_0.1/docs/INSTALL_*.md`
- Service management: `/Users/accusys/momentry_core_0.1/docs/SERVICE_ADDITION_GUIDE.md`
- Plist templates: `/Users/accusys/momentry_core_0.1/momentry_runtime/plist/`
503
monitor/config/monitor_config.yaml
Normal file
@@ -0,0 +1,503 @@
# Momentry monitoring system configuration
# Path: /Users/accusys/momentry_core_0.1/monitor/config/monitor_config.yaml

monitoring:
  enabled: true
  check_interval: 300  # seconds (5 minutes)

# Database connection
# (note: the Layer 5 section below reuses the top-level key `database`)
database:
  host: "localhost"
  port: 5432
  username: "accusys"
  password: ""  # connects via psql
  name: "momentry"

# ============================================================
# Layer 1: External monitoring
# ============================================================
external:
  enabled: true
  check_interval: 60  # seconds
  targets:
    - name: "ddns"
      type: "ddns"
      host: "momentry.ddns.net"
      enabled: true
    - name: "gateway"
      type: "gateway"
      host: "192.168.110.1"
      enabled: true
    - name: "internet"
      type: "internet"
      host: "8.8.8.8"
      enabled: true

# ============================================================
# Layer 2: Service monitoring
# ============================================================
service:
  enabled: true
  check_interval: 300  # seconds
  services:
    - name: "postgresql"
      type: "postgres"
      port: 5432
      host: "localhost"
      check_cmd: "pg_isready -h localhost -p 5432 -U accusys"
      timeout: 5
      enabled: true

    - name: "redis"
      type: "redis"
      port: 6379
      host: "localhost"
      password: "accusys"
      check_cmd: "redis-cli -a accusys ping"
      timeout: 5
      enabled: true

    - name: "mariadb"
      type: "mariadb"
      port: 3306
      host: "localhost"
      check_cmd: "mysql -u root -e 'SELECT 1'"
      timeout: 5
      enabled: true

    - name: "n8n"
      type: "http"
      port: 5678
      host: "localhost"
      check_url: "http://localhost:5678/"
      timeout: 10
      enabled: true

    - name: "caddy"
      type: "http"
      port: 2019
      host: "localhost"
      check_url: "http://localhost:2019/config/"
      timeout: 10
      enabled: true

    - name: "gitea"
      type: "http"
      port: 3000
      host: "localhost"
      check_url: "http://localhost:3000/"
      timeout: 10
      enabled: true

    - name: "sftpgo"
      type: "http"
      port: 8080
      host: "localhost"
      timeout: 10
      enabled: true

    - name: "ollama"
      type: "http"
      port: 11434
      host: "localhost"
      check_url: "http://localhost:11434/api/tags"
      timeout: 10
      enabled: true

    - name: "qdrant"
      type: "http"
      port: 6333
      host: "localhost"
      check_url: "http://localhost:6333/collections"
      timeout: 10
      enabled: true

    - name: "mongodb"
      type: "mongodb"
      port: 27017
      host: "localhost"
      timeout: 10
      enabled: true

    - name: "php"
      type: "process"
      process_name: "php-fpm"
      enabled: true

    - name: "node"
      type: "process"
      process_name: "node"
      enabled: true
      check_interval: 60
      version_lock: "22.x"
      locked_processes:
        - "n8n"
      description: "Node.js runtime (dedicated to n8n)"

    - name: "python"
      type: "process"
      process_name: "python3"
      enabled: true
      check_interval: 60
      version_lock: "3.11.14"
      scripts:
        - "/Users/accusys/momentry_core_0.1/scripts/asr_processor.py"
        - "/Users/accusys/momentry_core_0.1/scripts/thumbnail_extractor.py"
      description: "Python runtime (dedicated to Momentry scripts)"

    - name: "rustdesk_hbbs"
      type: "process"
      process_name: "hbbs"
      port: 21116
      enabled: true

    - name: "rustdesk_hbbr"
      type: "process"
      process_name: "hbbr"
      port: 21117
      enabled: true

# ============================================================
# Layer 3: n8n workflow monitoring
# ============================================================
workflow:
  enabled: true
  check_interval: 300  # seconds
  n8n:
    host: "http://localhost:5678"
    api_key: ""  # taken from an environment variable or from n8n
  idle_threshold_days: 30
  suggestions:
    - type: "disable_idle"
      threshold_days: 30
    - type: "delete_unused"
      threshold_days: 90
    - type: "optimize_failures"
      failure_rate_threshold: 0.2

# ============================================================
# Layer 4: WordPress portal monitoring
# ============================================================
portal:
  enabled: true
  check_interval: 300  # seconds
  wordpress:
    site_url: "https://wp.momentry.ddns.net"
    db_host: "localhost"
    db_name: "wordpress"
    db_user: "wp_user"
    db_password: "wp_password_123"
  page_monitoring:
    enabled: true
    pages:
      - url: "/"
        name: "homepage"
      - url: "/wp-login.php"
        name: "login_page"
    response_time_threshold_ms: 3000
  account_monitoring:
    enabled: true
    check_interval: 3600  # seconds (1 hour)
    alert_on_new_admin: true

# ============================================================
# Layer 5: Database monitoring
# ============================================================
database:
  enabled: true
  check_interval: 300  # seconds

  postgres:
    enabled: true
    databases:
      - name: "momentry"
      - name: "gitea"
      - name: "n8n"
      - name: "video_register"
    schema_monitoring: true

  redis:
    enabled: true
    password: "accusys"
    alert_thresholds:
      memory_percent: 80
      connected_clients: 100

  qdrant:
    enabled: true
    collections_watch: ["*"]  # watch all collections

  mariadb:
    enabled: true
    databases:
      - name: "wordpress"

  mongodb:
    enabled: true
    databases:
      - name: "momentry"
      - name: "admin"

# ============================================================
# Layer 6: User monitoring
# ============================================================
users:
  enabled: true
  check_interval: 60  # seconds

  session_tracking:
    enabled: true
    track_ssh: true
    track_web: true
    track_db: true
    track_sftp: true

  login_monitoring:
    enabled: true
    track_system: true
    track_wordpress: true
    track_n8n: true
    track_gitea: true

  sudo_tracking:
    enabled: true

  anomaly_detection:
    enabled: true
    rules:
      - type: "brute_force"
        threshold: 5
        window_seconds: 60
        severity: "critical"
      - type: "unusual_time"
        severity: "medium"
        allowed_hours: "08:00-22:00"
      - type: "idle_session"
        threshold_hours: 24
        severity: "low"

# ============================================================
# Layer 7: Storage monitoring (configured separately)
# ============================================================
storage:
  enabled: false  # implemented separately
  paths:
    hot: "/Users/accusys/momentry/data"
    warm: "/Volumes/RAID System/momentry/warm"
    cold: "/Volumes/Object Storage/momentry/archive"
    temp: "/Users/accusys/momentry/tmp"
    backup: "/Users/accusys/momentry/backup"
  clusters:
    - name: "family"
      path: "data/family"
      quota: "1TB"
    - name: "work"
      path: "data/work"
      quota: "2TB"
    - name: "wordpress"
      path: "data/wordpress"
      quota: "500GB"
    - name: "shared"
      path: "data/shared"
      quota: "1TB"
  migration:
    hot_to_warm_days: 7
    warm_to_cold_days: 90

# ============================================================
# Layer 7: Backup monitoring
# ============================================================
backup:
  enabled: true
  check_interval: 3600  # seconds (checked hourly)

  # Backup root directory
  backup_root: "/Users/accusys/momentry/backup"

  # Service list
  services:
    - name: "postgresql"
      enabled: true
      backup_type: "database"
      method: "pg_dump"
      schedule: "daily"
      retention:
        daily: 7
        weekly: 4
        monthly: 12

    - name: "redis"
      enabled: true
      backup_type: "database"
      method: "rdb"
      schedule: "daily"
      retention:
        daily: 7
        weekly: 4

    - name: "mariadb"
      enabled: true
      backup_type: "database"
      method: "mysqldump"
      schedule: "daily"
      retention:
        daily: 7
        weekly: 4

    - name: "n8n"
      enabled: true
      backup_type: "full"
      method: "tar"
      schedule: "daily"
      retention:
        daily: 7
        weekly: 4

    - name: "qdrant"
      enabled: true
      backup_type: "database"
      method: "snapshot"
      schedule: "daily"
      retention:
        daily: 7
        weekly: 4

    - name: "gitea"
      enabled: true
      backup_type: "full"
      method: "gitea_dump"
      schedule: "weekly"
      retention:
        weekly: 4
        monthly: 12

    - name: "mongodb"
      enabled: true
      backup_type: "database"
      method: "mongodump"
      schedule: "daily"
      retention:
        daily: 7
        weekly: 4

    - name: "ollama"
      enabled: true
      backup_type: "config"
      method: "tar"
      schedule: "weekly"
      retention:
        weekly: 4
        monthly: 12

    - name: "caddy"
      enabled: true
      backup_type: "config"
      method: "file"
      schedule: "weekly"
      retention:
        weekly: 4

    - name: "sftpgo"
      enabled: true
      backup_type: "config"
      method: "file"
      schedule: "weekly"
      retention:
        weekly: 4

    - name: "mongodb"
      enabled: true
      backup_type: "config"
      method: "file"
      schedule: "weekly"
      retention:
        weekly: 4

    - name: "php"
      enabled: true
      backup_type: "config"
      method: "file"
      schedule: "weekly"
      retention:
        weekly: 4

  # Warm/cold tiering
  tiering:
    enabled: true
    tiering_interval: 86400  # seconds (daily)
    rules:
      - from: "daily"
        to: "weekly"
        after_days: 7
      - from: "weekly"
        to: "monthly"
        after_days: 30
      - from: "monthly"
        to: "archive"
        after_days: 90

  # Storage thresholds
  thresholds:
    backup_age_warning_days: 7
    backup_age_critical_days: 14
    storage_percent_warning: 80
    storage_percent_critical: 90

  # Verification
  verify:
    enabled: true
    verify_on_completion: true
    test_restore: false  # test restore only; does not perform a real restore

# ============================================================
# Notification configuration
# ============================================================
|
||||
notifications:
|
||||
enabled: true
|
||||
log_only: true # log only; do not send notifications
|
||||
|
||||
# Logging
|
||||
log:
|
||||
enabled: true
|
||||
path: "/Users/accusys/momentry/log/monitor"
|
||||
|
||||
# n8n webhook (disabled by default)
|
||||
n8n:
|
||||
enabled: false
|
||||
webhook_url: "http://localhost:5678/webhook/monitor-alert"
|
||||
|
||||
# Telegram (disabled by default)
|
||||
telegram:
|
||||
enabled: false
|
||||
bot_token: ""
|
||||
chat_id: ""
|
||||
|
||||
# Email (disabled by default)
|
||||
email:
|
||||
enabled: false
|
||||
smtp_host: ""
|
||||
smtp_port: 587
|
||||
smtp_user: ""
|
||||
smtp_password: ""
|
||||
from_address: ""
|
||||
to_addresses: []
|
||||
|
||||
# ============================================================
# Data retention
# ============================================================
|
||||
retention:
|
||||
history_days: 30
|
||||
anomaly_days: 90
|
||||
session_days: 7
|
||||
login_days: 30
|
||||
|
||||
# ============================================================
# Alert thresholds
# ============================================================
|
||||
thresholds:
|
||||
service_response_time_ms: 3000
|
||||
database_memory_percent: 80
|
||||
disk_percent: 90
|
||||
cpu_percent: 90
|
||||
login_failures_per_user: 3
|
||||
brute_force_per_minute: 5
|
||||
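The `tiering` rules in the configuration above move a backup from one retention tier to the next once it passes an age threshold (7, 30, and 90 days). A minimal sketch of that lookup in shell, assuming `after_days` is measured from the backup's creation time and that the boundary is inclusive (the config states neither); `next_tier` is a hypothetical helper name, not part of the repository:

```bash
#!/bin/bash
# Hypothetical helper: given a backup's current tier and its age in days,
# print the tier it should be in after applying one tiering rule.
# Thresholds mirror the daily/weekly/monthly/archive rules in the config.
next_tier() {
    local tier=$1 age_days=$2
    case "$tier" in
        daily)   if [ "$age_days" -ge 7 ];  then tier="weekly";  fi ;;
        weekly)  if [ "$age_days" -ge 30 ]; then tier="monthly"; fi ;;
        monthly) if [ "$age_days" -ge 90 ]; then tier="archive"; fi ;;
    esac
    echo "$tier"
}

# Example: a weekly backup that is 45 days old.
echo "weekly backup, 45 days old -> $(next_tier weekly 45)"
```

Each call applies a single rule, so a very old `daily` backup would need repeated passes to reach `archive`; whether the real tiering job cascades in one run is not stated in the config.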
362
monitor/control/monitor_control.sh
Executable file
@@ -0,0 +1,362 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry monitoring system control script
# Path: /Users/accusys/momentry_core_0.1/monitor/control/monitor_control.sh
|
||||
|
||||
set -e
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
MONITOR_DIR="$(dirname "$SCRIPT_DIR")"
|
||||
CONFIG_DIR="$MONITOR_DIR/config"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
# Color definitions
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
BLUE='\033[0;34m'
|
||||
NC='\033[0m'
|
||||
|
||||
# ============================================================
# Help
# ============================================================

show_help() {
    cat << EOF
Momentry monitoring system control script

Usage: $0 <command> [options]

Commands:
    status                  Show monitoring status
    check <layer>           Run the check for a specific layer
                            layers: service, workflow, portal, database, users, storage, external, backup, node, python, all
    monitor                 Monitor continuously (every 5 minutes)
    init                    Initialize the monitoring database tables
    logs <layer> [lines]    Show logs
    clean                   Clean up historical data
    help                    Show this help

Examples:
    $0 status               Show the status of all monitors
    $0 check service        Check service status
    $0 check backup         Check backup status
    $0 check all            Run a full check
    $0 logs anomaly 50      Show the 50 most recent anomalies
    $0 init                 Initialize the database tables

EOF
}
|
||||
|
||||
# ============================================================
# Initialization
# ============================================================

init_monitor() {
    echo -e "${BLUE}Initializing monitoring system...${NC}"

    # Create the log directory
    mkdir -p "$LOG_DIR"

    # Create the database tables.
    # Without ON_ERROR_STOP, psql exits non-zero only when it cannot connect
    # or cannot read the file, so the fallback message reports that case.
    echo "Creating monitoring database tables..."
    psql -U accusys -h localhost -d momentry -f "$MONITOR_DIR/database/schema.sql" 2>/dev/null || \
        echo "Could not apply schema.sql (check that PostgreSQL is reachable and the file exists)"

    echo -e "${GREEN}Initialization complete${NC}"
}
|
||||
|
||||
# ============================================================
# Status overview
# ============================================================

# Run a COUNT query against the monitoring database; print 0 if psql fails.
query_count() {
    psql -U accusys -h localhost -d momentry -t -A -c "$1" 2>/dev/null || echo "0"
}

show_status() {
    echo ""
    echo "========================================"
    echo -e "${BLUE}Momentry Monitoring System Status${NC}"
    echo "========================================"
    echo ""

    # Service status
    echo -e "${YELLOW}Layer 2: Service status${NC}"
    local service_count=$(query_count "SELECT COUNT(*) FROM monitor_services WHERE checked_at > NOW() - INTERVAL '5 minutes' AND status = 'up';")
    local service_total=$(query_count "SELECT COUNT(DISTINCT service_name) FROM monitor_services;")
    echo "  Services: $service_count / $service_total up"
    echo ""

    # Workflow status
    echo -e "${YELLOW}Layer 3: Workflow status${NC}"
    local active_wf=$(query_count "SELECT COUNT(*) FROM monitor_workflows WHERE is_active = true;")
    local idle_wf=$(query_count "SELECT COUNT(*) FROM monitor_workflows WHERE idle_days > 30;")
    echo "  Active workflows: $active_wf"
    echo "  Idle workflows (>30 days): $idle_wf"
    echo ""

    # Database status
    echo -e "${YELLOW}Layer 5: Database status${NC}"
    local db_count=$(query_count "SELECT COUNT(DISTINCT db_type) FROM monitor_databases WHERE checked_at > NOW() - INTERVAL '5 minutes';")
    echo "  Monitored databases: $db_count"
    echo ""

    # Anomalies in the last 24 hours
    echo -e "${YELLOW}Layer 6: Anomalies${NC}"
    local critical=$(query_count "SELECT COUNT(*) FROM monitor_anomalies WHERE severity = 'critical' AND detected_at > NOW() - INTERVAL '24 hours';")
    local warning=$(query_count "SELECT COUNT(*) FROM monitor_anomalies WHERE severity IN ('medium', 'high') AND detected_at > NOW() - INTERVAL '24 hours';")
    echo "  Critical: $critical"
    echo "  Warning: $warning"
    echo ""

    # Node.js status
    echo -e "${YELLOW}Node.js runtime${NC}"
    local node_compliant=$(query_count "SELECT COUNT(*) FROM node_version_baseline WHERE is_compliant = true;")
    local node_total=$(query_count "SELECT COUNT(*) FROM node_version_baseline;")
    echo "  Version compliant: $node_compliant / $node_total"
    echo ""

    # Python status
    echo -e "${YELLOW}Python runtime${NC}"
    local python_compliant=$(query_count "SELECT COUNT(*) FROM python_version_baseline WHERE is_compliant = true;")
    local python_total=$(query_count "SELECT COUNT(*) FROM python_version_baseline;")
    echo "  Version compliant: $python_compliant / $python_total"
    echo ""

    # Backup status over the last 7 days
    echo -e "${YELLOW}Layer 7: Backup status${NC}"
    local total_backups=$(query_count "SELECT COUNT(*) FROM backup_registry WHERE created_at > NOW() - INTERVAL '7 days';")
    local failed_backups=$(query_count "SELECT COUNT(*) FROM backup_registry WHERE status = 'failed' AND created_at > NOW() - INTERVAL '7 days';")
    echo "  Backups in the last 7 days: $total_backups"
    echo "  Failed: $failed_backups"
    echo ""

    echo "========================================"
    echo "Run '$0 check <layer>' to perform a check"
    echo ""
}
|
||||
|
||||
# ============================================================
# Run checks
# ============================================================

check_layer() {
    local layer=$1

    case "$layer" in
        service)
            echo -e "${BLUE}Running Layer 2: Service monitoring...${NC}"
            bash "$MONITOR_DIR/service/health_check.sh"
            ;;
        workflow)
            echo -e "${BLUE}Running Layer 3: Workflow monitoring...${NC}"
            bash "$MONITOR_DIR/workflow/n8n_workflow_monitor.sh"
            ;;
        portal)
            echo -e "${BLUE}Running Layer 4: Portal monitoring...${NC}"
            bash "$MONITOR_DIR/portal/page_monitor.sh"
            ;;
        database)
            echo -e "${BLUE}Running Layer 5: Database monitoring...${NC}"
            bash "$MONITOR_DIR/database/postgres_monitor.sh"
            bash "$MONITOR_DIR/database/redis_monitor.sh"
            bash "$MONITOR_DIR/database/qdrant_monitor.sh"
            ;;
        users)
            echo -e "${BLUE}Running Layer 6: User monitoring...${NC}"
            bash "$MONITOR_DIR/users/session_tracker.sh"
            ;;
        storage)
            echo -e "${BLUE}Running Layer 7: Storage monitoring...${NC}"
            bash "$MONITOR_DIR/storage/storage_manager.sh" status
            ;;
        backup)
            echo -e "${BLUE}Running Layer 7: Backup monitoring...${NC}"
            bash "$MONITOR_DIR/storage/backup_monitor.sh" status
            ;;
        external)
            echo -e "${BLUE}Running Layer 1: External monitoring...${NC}"
            bash "$MONITOR_DIR/service/external_monitor.sh"
            ;;
        node)
            echo -e "${BLUE}Running Node.js version monitoring...${NC}"
            bash "$MONITOR_DIR/service/node_monitor.sh"
            ;;
        python)
            echo -e "${BLUE}Running Python version monitoring...${NC}"
            bash "$MONITOR_DIR/service/python_monitor.sh"
            ;;
        all)
            # Note: 'all' does not include the storage and backup layers;
            # run 'check storage' and 'check backup' explicitly.
            echo -e "${BLUE}Running full monitoring check...${NC}"
            check_layer external
            check_layer service
            check_layer node
            check_layer python
            check_layer workflow
            check_layer portal
            check_layer database
            check_layer users
            echo -e "${GREEN}Full check complete${NC}"
            ;;
        *)
            echo -e "${RED}Unknown layer: $layer${NC}"
            show_help
            exit 1
            ;;
    esac
}
|
||||
|
||||
# ============================================================
# Continuous monitoring
# ============================================================

run_monitor() {
    echo -e "${BLUE}Starting continuous monitoring (every 5 minutes)${NC}"
    echo "Press Ctrl+C to stop"
    echo ""

    while true; do
        local start_time=$(date +%s)

        # Called in an || list so that `set -e` is suspended inside
        # check_layer: a monitor script that exits non-zero (for example
        # when Redis is unavailable) no longer terminates the loop.
        check_layer all || true

        local end_time=$(date +%s)
        local elapsed=$((end_time - start_time))

        if [ "$elapsed" -lt 300 ]; then
            sleep $((300 - elapsed))
        fi
    done
}
|
||||
|
||||
# ============================================================
# Show logs
# ============================================================

show_logs() {
    local layer=${1:-anomaly}
    local lines=${2:-20}

    # The line count is interpolated into SQL, so accept digits only
    case "$lines" in
        ''|*[!0-9]*)
            echo -e "${RED}Invalid line count: $lines${NC}"
            return 1
            ;;
    esac

    case "$layer" in
        anomaly)
            echo -e "${BLUE}Recent anomalies:${NC}"
            psql -U accusys -h localhost -d momentry -c "
                SELECT
                    TO_CHAR(detected_at, 'YYYY-MM-DD HH24:MI:SS') as time,
                    severity,
                    anomaly_type,
                    username,
                    LEFT(description, 50) as desc
                FROM monitor_anomalies
                ORDER BY detected_at DESC
                LIMIT $lines;
            " 2>/dev/null || echo "Unable to connect to the database"
            ;;
        service)
            echo -e "${BLUE}Recent service status:${NC}"
            psql -U accusys -h localhost -d momentry -c "
                SELECT
                    service_name,
                    status,
                    response_time_ms,
                    TO_CHAR(checked_at, 'YYYY-MM-DD HH24:MI:SS') as time
                FROM monitor_services
                ORDER BY checked_at DESC
                LIMIT $lines;
            " 2>/dev/null || echo "Unable to connect to the database"
            ;;
        workflow)
            echo -e "${BLUE}Recent workflow status:${NC}"
            psql -U accusys -h localhost -d momentry -c "
                SELECT
                    workflow_name,
                    is_active,
                    idle_days,
                    suggestion,
                    TO_CHAR(checked_at, 'YYYY-MM-DD HH24:MI:SS') as time
                FROM monitor_workflows
                ORDER BY checked_at DESC
                LIMIT $lines;
            " 2>/dev/null || echo "Unable to connect to the database"
            ;;
        *)
            echo -e "${RED}Unknown log type: $layer${NC}"
            ;;
    esac
}
|
||||
|
||||
# ============================================================
# Clean up historical data
# ============================================================

clean_history() {
    echo -e "${YELLOW}Cleaning up historical data...${NC}"

    psql -U accusys -h localhost -d momentry << EOF 2>/dev/null || true
-- Keep 30 days of check results
DELETE FROM monitor_services WHERE checked_at < NOW() - INTERVAL '30 days';
DELETE FROM monitor_workflows WHERE checked_at < NOW() - INTERVAL '30 days';
DELETE FROM monitor_databases WHERE checked_at < NOW() - INTERVAL '30 days';
DELETE FROM monitor_external WHERE checked_at < NOW() - INTERVAL '30 days';
DELETE FROM monitor_portal_pages WHERE checked_at < NOW() - INTERVAL '30 days';

-- Keep 30 days of version baselines
DELETE FROM node_version_baseline WHERE checked_at < NOW() - INTERVAL '30 days';
DELETE FROM node_process_tracking WHERE checked_at < NOW() - INTERVAL '30 days';
DELETE FROM python_version_baseline WHERE checked_at < NOW() - INTERVAL '30 days';
DELETE FROM python_script_tracking WHERE checked_at < NOW() - INTERVAL '30 days';

-- Keep 7 days of sessions
DELETE FROM monitor_sessions WHERE connected_at < NOW() - INTERVAL '7 days';

-- Keep 30 days of logins
DELETE FROM monitor_logins WHERE login_at < NOW() - INTERVAL '30 days';

-- Keep 90 days of anomalies
DELETE FROM monitor_anomalies WHERE detected_at < NOW() - INTERVAL '90 days';

-- Reclaim free space
VACUUM ANALYZE;
EOF

    echo -e "${GREEN}Cleanup complete${NC}"
}
|
||||
|
||||
# ============================================================
# Main
# ============================================================

main() {
    # Make sure the log directory exists
    mkdir -p "$LOG_DIR"

    local command=${1:-status}

    case "$command" in
        status)
            show_status
            ;;
        check)
            check_layer "${2:-all}"
            ;;
        monitor)
            run_monitor
            ;;
        init)
            init_monitor
            ;;
        logs)
            show_logs "${2:-anomaly}" "${3:-20}"
            ;;
        clean)
            clean_history
            ;;
        help|--help|-h)
            show_help
            ;;
        *)
            echo -e "${RED}Unknown command: $command${NC}"
            show_help
            exit 1
            ;;
    esac
}

main "$@"
|
||||
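The `monitor` command keeps a fixed five-minute cadence by subtracting the time a check run took from the interval before sleeping. A small sketch of that calculation on its own; `remaining_sleep` is a hypothetical helper name used only for illustration and is not part of the script:

```bash
#!/bin/bash
# Hypothetical helper: seconds left to sleep so that each cycle starts
# `interval` seconds after the previous one; 0 if the check overran.
remaining_sleep() {
    local elapsed=$1 interval=$2
    if [ "$elapsed" -lt "$interval" ]; then
        echo $((interval - elapsed))
    else
        echo 0
    fi
}

# Example: a check run that took 42 seconds in a 300-second cycle.
echo "sleep for $(remaining_sleep 42 300)s"
```

When a run takes longer than the interval, the next cycle starts immediately rather than skipping ahead, which matches the `if [ $elapsed -lt 300 ]` guard in the script.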
82
monitor/database/mongodb_monitor.sh
Executable file
@@ -0,0 +1,82 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry MongoDB monitoring (Layer 5)
# Path: /Users/accusys/momentry_core_0.1/monitor/database/mongodb_monitor.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/mongodb_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
# Credentials; the defaults can be overridden from the environment
MONGO_USER="${MONGO_USER:-accusys}"
MONGO_PASS="${MONGO_PASS:-Test3200Test3200}"
|
||||
|
||||
record_metric() {
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_databases (db_type, db_name, metric_name, metric_value, checked_at)
|
||||
VALUES ('mongodb', 'mongodb', '$1', '$2', NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
get_status() {
|
||||
mongosh --quiet --username "$MONGO_USER" --password "$MONGO_PASS" --authenticationDatabase admin --eval "JSON.stringify(db.adminCommand({ replSetGetStatus: 1 }))" 2>/dev/null || echo "{}"
|
||||
}
|
||||
|
||||
get_server_status() {
|
||||
mongosh --quiet --username "$MONGO_USER" --password "$MONGO_PASS" --authenticationDatabase admin --eval "JSON.stringify(db.serverStatus())" 2>/dev/null || echo "{}"
|
||||
}
|
||||
|
||||
get_databases() {
|
||||
mongosh --quiet --username "$MONGO_USER" --password "$MONGO_PASS" --authenticationDatabase admin --eval "JSON.stringify(db.adminCommand({ listDatabases: 1 }))" 2>/dev/null || echo "{}"
|
||||
}
|
||||
|
||||
echo "========================================"
|
||||
echo "Layer 5: MongoDB Monitoring"
|
||||
echo "Time: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
if ! mongosh --quiet --username "$MONGO_USER" --password "$MONGO_PASS" --authenticationDatabase admin --eval "db.adminCommand('ping')" > /dev/null 2>&1; then
    echo "MongoDB unavailable"
    log "MongoDB unavailable"
    exit 1
fi

echo "✓ MongoDB connection OK"
echo ""

echo "Databases:"
echo "----------------------------------------"
databases=$(get_databases)
echo "$databases" | jq -r '.databases[] | "  \(.name): \(.sizeOnDisk / 1024 / 1024 | floor)MB"' 2>/dev/null || echo "  Unable to retrieve the database list"

echo ""
echo "Server status:"
echo "----------------------------------------"
server_status=$(get_server_status)
connections=$(echo "$server_status" | jq -r '.connections.current' 2>/dev/null || echo "N/A")
active_connections=$(echo "$server_status" | jq -r '.connections.active' 2>/dev/null || echo "N/A")
uptime=$(echo "$server_status" | jq -r '.uptime' 2>/dev/null || echo "N/A")
mem_resident=$(echo "$server_status" | jq -r '.mem.resident' 2>/dev/null || echo "N/A")

echo "  Current connections: $connections"
echo "  Active connections: $active_connections"
echo "  Uptime: ${uptime}s"
echo "  Resident memory: ${mem_resident}MB"
|
||||
|
||||
record_metric "connections" "$connections"
|
||||
record_metric "active_connections" "$active_connections"
|
||||
record_metric "uptime" "$uptime"
|
||||
record_metric "mem_resident" "$mem_resident"
|
||||
|
||||
echo ""
|
||||
log "MongoDB check completed: connections=$connections"
|
||||
|
||||
echo "========================================"
|
||||
echo "Done"
|
||||
echo "========================================"
|
||||
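The monitor writes each value into `monitor_databases.metric_value`, which `schema.sql` declares as JSONB, so a fallback such as `N/A` is not valid input as it stands. A minimal sketch of one way to normalize single-line values before the insert; `json_scalar` is a hypothetical helper, not something the script defines:

```bash
#!/bin/bash
# Hypothetical helper: print a single-line value as a JSON scalar.
# Plain numbers pass through unchanged; anything else becomes a JSON string.
json_scalar() {
    local value=$1
    if printf '%s' "$value" | grep -Eq '^-?(0|[1-9][0-9]*)(\.[0-9]+)?$'; then
        printf '%s\n' "$value"
    else
        # Escape backslashes and double quotes, then wrap in double quotes.
        value=$(printf '%s' "$value" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
        printf '"%s"\n' "$value"
    fi
}

json_scalar 42
json_scalar "N/A"
```

The result could then be passed to `record_metric` in place of the raw value. Single quotes are not handled here, so values that may contain them would still need SQL escaping.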
130
monitor/database/postgres_monitor.sh
Executable file
@@ -0,0 +1,130 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry PostgreSQL monitoring (Layer 5)
# Path: /Users/accusys/momentry_core_0.1/monitor/database/postgres_monitor.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
MONITOR_DIR="$(dirname "$SCRIPT_DIR")"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/postgres_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
# Record a metric
|
||||
record_metric() {
|
||||
local db_type="postgresql"
|
||||
local db_name=$1
|
||||
local metric_name=$2
|
||||
local value=$3
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_databases (db_type, db_name, metric_name, metric_value, checked_at)
|
||||
VALUES ('$db_type', '$db_name', '$metric_name', '$value', NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
# List the databases
|
||||
get_databases() {
|
||||
psql -U accusys -h localhost -t -A -c "SELECT datname FROM pg_database WHERE datistemplate = false;" 2>/dev/null
|
||||
}
|
||||
|
||||
# Get table sizes
|
||||
get_table_sizes() {
|
||||
local db=$1
|
||||
psql -U accusys -h localhost -d "$db" -t -A -c "
|
||||
SELECT
|
||||
schemaname,
|
||||
tablename,
|
||||
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size,
|
||||
n_live_tup as rows
|
||||
FROM pg_stat_user_tables
|
||||
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
|
||||
LIMIT 10;
|
||||
" 2>/dev/null
|
||||
}
|
||||
|
||||
# Get connection counts by state
|
||||
get_connections() {
|
||||
psql -U accusys -h localhost -t -A -c "
|
||||
SELECT state, count(*)
|
||||
FROM pg_stat_activity
|
||||
WHERE datname = current_database()
|
||||
GROUP BY state;
|
||||
" 2>/dev/null
|
||||
}
|
||||
|
||||
# Get slow queries (requires the pg_stat_statements extension).
# Note: on PostgreSQL 13 and later the mean_time column is named mean_exec_time.
get_slow_queries() {
    psql -U accusys -h localhost -t -A -c "
        SELECT query, calls, mean_time
        FROM pg_stat_statements
        WHERE query NOT LIKE '%pg_stat_statements%'
        ORDER BY mean_time DESC
        LIMIT 5;
    " 2>/dev/null
}
|
||||
|
||||
# Main
echo "========================================"
echo "Layer 5: PostgreSQL Database Monitoring"
echo "Time: $(date)"
echo "========================================"
echo ""

# Check whether PostgreSQL is available
if ! pg_isready -h localhost -p 5432 -U accusys > /dev/null 2>&1; then
    echo "PostgreSQL unavailable"
    log "PostgreSQL unavailable"
    exit 1
fi
|
||||
|
||||
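# Report connection counts
connections=$(get_connections)
echo "Connection states:"
echo "$connections"
echo ""

# Record the metric. metric_value is a JSONB column, so encode the
# multi-line "state|count" text as a single JSON string with jq first
# (jq is already used by the other database monitors).
record_metric "postgres" "connections" "$(printf '%s' "$connections" | jq -R -s '.')"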
|
||||
|
||||
# Check each database
echo "Database tables:"
echo "----------------------------------------"

for db in $(get_databases); do
    echo ""
    echo "Database: $db"

    table_count=$(psql -U accusys -h localhost -d "$db" -t -A -c "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public';" 2>/dev/null || echo "0")
    echo "  Table count: $table_count"

    record_metric "$db" "table_count" "$table_count"

    # Show the largest tables. psql -t -A separates columns with '|',
    # so split on that character rather than on whitespace.
    echo "  Largest tables:"
    get_table_sizes "$db" | while IFS='|' read -r schema table size rows; do
        [ -z "$table" ] && continue
        echo "    - $table: $size ($rows rows)"
    done
done
|
||||
|
||||
# Check slow queries (if pg_stat_statements is available)
echo ""
echo "Slow queries (if any):"
slow_queries=$(get_slow_queries)
if [ -n "$slow_queries" ]; then
    # Rows are "query|calls|mean_time". Take the last two '|'-separated
    # fields so that spaces or '|' inside the query text do not shift them.
    echo "$slow_queries" | while IFS= read -r line; do
        case "$line" in
            *"|"*"|"*) ;;
            *) continue ;;
        esac
        time=${line##*|}
        rest=${line%|*}
        calls=${rest##*|}
        echo "  - ${time}ms ($calls calls)"
    done
else
    echo "  (pg_stat_statements not enabled)"
fi
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
log "PostgreSQL check completed"
|
||||
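`psql -t -A` prints unaligned rows with `|` between columns, so a reader loop has to split on that character rather than on whitespace. A small sketch of the parsing step, fed with made-up sample rows instead of a live database; `format_table_sizes` is a hypothetical name used only for this illustration:

```bash
#!/bin/bash
# Hypothetical helper: format "schema|table|size|rows" lines, the shape
# produced by psql -t -A, into an indented list.
format_table_sizes() {
    while IFS='|' read -r schema table size rows; do
        [ -z "$table" ] && continue
        echo "    - $schema.$table: $size ($rows rows)"
    done
}

# Sample input standing in for real psql output.
printf 'public|monitor_services|128 kB|1500\npublic|monitor_anomalies|16 kB|12\n' | format_table_sizes
```

Setting `IFS='|'` only for the `read` command keeps values such as `128 kB` intact, since the space is no longer a field separator.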
124
monitor/database/qdrant_monitor.sh
Executable file
@@ -0,0 +1,124 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry Qdrant monitoring (Layer 5)
# Path: /Users/accusys/momentry_core_0.1/monitor/database/qdrant_monitor.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/qdrant_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
QDRANT_HOST="http://localhost:6333"
|
||||
QDRANT_API_KEY="Test3200Test3200Test3200"
|
||||
|
||||
# Record a metric
|
||||
record_metric() {
|
||||
local collection=$1
|
||||
local metric_name=$2
|
||||
local value=$3
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_databases (db_type, db_name, metric_name, metric_value, checked_at)
|
||||
VALUES ('qdrant', '$collection', '$metric_name', '$value', NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
# Record a collection
|
||||
record_collection() {
|
||||
local name=$1
|
||||
local vectors=$2
|
||||
local points=$3
|
||||
local disk_size=$4
|
||||
local status=$5
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_qdrant_collections (collection_name, vectors_count, points_count, disk_size_bytes, status, snapshot_at)
|
||||
VALUES ('$name', $vectors, $points, $disk_size, '$status', NOW())
|
||||
ON CONFLICT (collection_name) DO UPDATE SET
|
||||
vectors_count = EXCLUDED.vectors_count,
|
||||
points_count = EXCLUDED.points_count,
|
||||
disk_size_bytes = EXCLUDED.disk_size_bytes,
|
||||
status = EXCLUDED.status,
|
||||
snapshot_at = NOW();
|
||||
EOF
|
||||
}
|
||||
|
||||
# Main
echo "========================================"
echo "Layer 5: Qdrant Vector Database Monitoring"
echo "Time: $(date)"
echo "========================================"
echo ""

# Check whether Qdrant is available.
# curl itself prints 000 for %{http_code} when the connection fails, so no
# extra fallback is appended (that would produce "000000" and defeat the test).
http_code=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "api-key: $QDRANT_API_KEY" \
    "$QDRANT_HOST/collections" 2>/dev/null)
if [ -z "$http_code" ] || [ "$http_code" = "000" ]; then
    echo "Qdrant unavailable"
    log "Qdrant unavailable"
    exit 1
fi

echo "Qdrant status: HTTP $http_code"
|
||||
echo ""
|
||||
|
||||
# Get the collection list
collections=$(curl -s -H "api-key: $QDRANT_API_KEY" "$QDRANT_HOST/collections" 2>/dev/null)

if [ "$http_code" = "200" ]; then
    collection_count=$(echo "$collections" | jq '.result.collections | length' 2>/dev/null || echo "0")
    echo "Collection count: $collection_count"
    echo ""

    # Iterate over each collection
    echo "Collections:"
    echo "----------------------------------------"

    # Use an arithmetic loop: BSD/macOS seq counts down for "seq 0 -1",
    # which would run the body even when there are no collections.
    for ((i = 0; i < collection_count; i++)); do
        name=$(echo "$collections" | jq -r ".result.collections[$i].name")

        # Get the collection details
        details=$(curl -s -H "api-key: $QDRANT_API_KEY" "$QDRANT_HOST/collections/$name" 2>/dev/null)
|
||||
|
||||
vectors_count=$(echo "$details" | jq -r '.result.indexed_vectors_count // 0' 2>/dev/null || echo "0")
|
||||
points_count=$(echo "$details" | jq -r '.result.points_count // 0' 2>/dev/null || echo "0")
|
||||
disk_size=$(echo "$details" | jq -r '.result.disk_size_bytes // 0' 2>/dev/null || echo "0")
|
||||
status=$(echo "$details" | jq -r '.result.status // "unknown"' 2>/dev/null || echo "unknown")
|
||||
|
||||
# Format the size
|
||||
if [ "$disk_size" -gt 1073741824 ]; then
|
||||
size_str="$((disk_size / 1073741824))GB"
|
||||
elif [ "$disk_size" -gt 1048576 ]; then
|
||||
size_str="$((disk_size / 1048576))MB"
|
||||
elif [ "$disk_size" -gt 1024 ]; then
|
||||
size_str="$((disk_size / 1024))KB"
|
||||
else
|
||||
size_str="${disk_size}B"
|
||||
fi
|
||||
|
||||
        echo "  - $name"
        echo "      Status: $status"
        echo "      Vectors: $vectors_count"
        echo "      Points: $points_count"
        echo "      Size: $size_str"

        # Record to the database
        record_collection "$name" "$vectors_count" "$points_count" "$disk_size" "$status"
        record_metric "$name" "vectors_count" "$vectors_count"
        record_metric "$name" "points_count" "$points_count"
        record_metric "$name" "disk_size" "$disk_size"
|
||||
done
|
||||
else
|
||||
    echo "Unable to retrieve the collection list"
|
||||
log "Failed to get collections: HTTP $http_code"
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
log "Qdrant check completed"
|
||||
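The size formatting in the collection loop picks the largest unit that fits and uses integer division. The same idea as a standalone function, sketched with a hypothetical name `human_size`; unlike the script's `-gt` comparisons it uses `-ge`, so exactly 1024 bytes prints as `1KB`:

```bash
#!/bin/bash
# Hypothetical helper: render a byte count with integer GB/MB/KB/B units.
human_size() {
    local bytes=$1
    if [ "$bytes" -ge 1073741824 ]; then
        echo "$((bytes / 1073741824))GB"
    elif [ "$bytes" -ge 1048576 ]; then
        echo "$((bytes / 1048576))MB"
    elif [ "$bytes" -ge 1024 ]; then
        echo "$((bytes / 1024))KB"
    else
        echo "${bytes}B"
    fi
}

# A few sample values.
for n in 512 20480 5242880 3221225472; do
    echo "$n -> $(human_size "$n")"
done
```

Integer division truncates, so 1536 bytes is reported as `1KB`; that is usually acceptable for a status display but not for capacity accounting.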
111
monitor/database/redis_monitor.sh
Executable file
@@ -0,0 +1,111 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry Redis monitoring (Layer 5)
# Path: /Users/accusys/momentry_core_0.1/monitor/database/redis_monitor.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/redis_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
REDIS_PASS="accusys"
|
||||
|
||||
# Record a metric: record_metric <metric_name> <value>
record_metric() {
    local metric_name=$1
    local value=$2

    psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
INSERT INTO monitor_databases (db_type, db_name, metric_name, metric_value, checked_at)
VALUES ('redis', 'redis', '$metric_name', '$value', NOW());
EOF
}
|
||||
|
||||
# Get Redis INFO
|
||||
get_info() {
|
||||
redis-cli -a "$REDIS_PASS" INFO 2>/dev/null
|
||||
}
|
||||
|
||||
# Main
echo "========================================"
echo "Layer 5: Redis Monitoring"
echo "Time: $(date)"
echo "========================================"
echo ""

# Check whether Redis is available
if ! redis-cli -a "$REDIS_PASS" ping > /dev/null 2>&1; then
    echo "Redis unavailable"
    log "Redis unavailable"
    exit 1
fi
|
||||
|
||||
info=$(get_info)
|
||||
|
||||
# Extract key metrics
echo "Key metrics:"
echo "----------------------------------------"

# Memory usage
used_memory=$(echo "$info" | grep "^used_memory_human:" | cut -d: -f2 | tr -d '\r')
echo "  Memory usage: $used_memory"

# Connections
connected_clients=$(echo "$info" | grep "^connected_clients:" | cut -d: -f2 | tr -d '\r')
echo "  Client connections: $connected_clients"

# Hit rate
keyspace_hits=$(echo "$info" | grep "^keyspace_hits:" | cut -d: -f2 | tr -d '\r')
keyspace_misses=$(echo "$info" | grep "^keyspace_misses:" | cut -d: -f2 | tr -d '\r')
total_ops=$((${keyspace_hits:-0} + ${keyspace_misses:-0}))
if [ "$total_ops" -gt 0 ]; then
    hit_rate=$((keyspace_hits * 100 / total_ops))
    echo "  Hit rate: ${hit_rate}%"
else
    echo "  Hit rate: N/A"
fi

# Persistence
rdb_changes=$(echo "$info" | grep "^rdb_changes_since_last_save:" | cut -d: -f2 | tr -d '\r')
echo "  RDB changes since last save: $rdb_changes"

# Total keys
echo ""
echo "Keyspace:"
db0_info=$(echo "$info" | grep "^db0:" | head -1)
if [ -n "$db0_info" ]; then
    keys=$(echo "$db0_info" | sed 's/.*keys=\([0-9]*\).*/\1/')
    expires=$(echo "$db0_info" | sed 's/.*expires=\([0-9]*\).*/\1/')
    echo "  db0: $keys keys, $expires with an expiry"
fi
|
||||
|
||||
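# Record to the database. metric_value is a JSONB column, so the
# human-readable size (e.g. 1.50M) is passed as a JSON string in double
# quotes; wrapping it in single quotes would break the SQL literal.
record_metric "used_memory" "\"$used_memory\""
record_metric "connected_clients" "$connected_clients"
record_metric "keyspace_hits" "$keyspace_hits"
record_metric "keyspace_misses" "$keyspace_misses"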
|
||||
|
||||
# Check thresholds
echo ""
echo "Threshold checks:"
# Strip the trailing carriage return from the INFO line, otherwise the
# arithmetic below fails.
used_memory_bytes=$(echo "$info" | grep "^used_memory:" | cut -d: -f2 | tr -d '\r')
maxmemory=$(redis-cli -a "$REDIS_PASS" CONFIG GET maxmemory 2>/dev/null | tail -1)
if [ -n "$maxmemory" ] && [ "$maxmemory" -gt 0 ]; then
    mem_pct=$((used_memory_bytes * 100 / maxmemory))
    echo "  Memory usage: ${mem_pct}%"
    if [ "$mem_pct" -gt 80 ]; then
        echo "  ⚠️ Memory usage above 80%"
    fi
fi

if [ "${connected_clients:-0}" -gt 100 ]; then
    echo "  ⚠️ Too many client connections"
fi
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
log "Redis check completed"
|
||||
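The Redis monitor reads `key:value` lines from `INFO`, strips the trailing carriage return, and derives an integer hit rate from `keyspace_hits` and `keyspace_misses`. A compact sketch of those two steps, run against sample text rather than a live server; `info_field` and `hit_rate` are hypothetical helper names, not functions from the script:

```bash
#!/bin/bash
# Hypothetical helper: read one "key:value" field from Redis INFO output,
# dropping the carriage returns that INFO lines end with.
info_field() {
    printf '%s\n' "$1" | tr -d '\r' | awk -F: -v key="$2" '$1 == key { print $2; exit }'
}

# Hypothetical helper: integer cache hit rate in percent, or N/A when
# there have been no lookups yet.
hit_rate() {
    local hits=$1 misses=$2
    local total=$((hits + misses))
    if [ "$total" -gt 0 ]; then
        echo $((hits * 100 / total))
    else
        echo "N/A"
    fi
}

# Sample INFO text standing in for `redis-cli INFO` output.
sample=$(printf 'keyspace_hits:75\r\nkeyspace_misses:25\r\n')
hits=$(info_field "$sample" keyspace_hits)
misses=$(info_field "$sample" keyspace_misses)
echo "hit rate: $(hit_rate "$hits" "$misses")"
```

Matching on the exact field name avoids the prefix problem where `used_memory` would also match `used_memory_human`.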
492
monitor/database/schema.sql
Normal file
@@ -0,0 +1,492 @@
|
||||
-- Momentry monitoring system database tables
-- Usage: psql -U accusys -h localhost -d momentry -f schema.sql
|
||||
|
||||
-- ============================================================
-- Layer 2: Service monitoring
-- ============================================================
|
||||
|
||||
CREATE TABLE IF NOT EXISTS monitor_services (
|
||||
id SERIAL PRIMARY KEY,
|
||||
service_name VARCHAR(50) NOT NULL,
|
||||
service_type VARCHAR(20),
|
||||
port INTEGER,
|
||||
status VARCHAR(20) CHECK (status IN ('up', 'down', 'degraded', 'unknown')),
|
||||
response_time_ms INTEGER,
|
||||
error_message TEXT,
|
||||
checked_at TIMESTAMP DEFAULT NOW()
|
||||
);
|
||||
|
||||
CREATE INDEX IF NOT EXISTS idx_monitor_services_name ON monitor_services(service_name);
CREATE INDEX IF NOT EXISTS idx_monitor_services_time ON monitor_services(checked_at);
|
||||
|
||||
-- ============================================================
-- Layer 3: n8n workflow monitoring
-- ============================================================
|
||||
|
||||
CREATE TABLE IF NOT EXISTS monitor_workflows (
|
||||
id SERIAL PRIMARY KEY,
|
||||
workflow_id VARCHAR(50) NOT NULL,
|
||||
workflow_name VARCHAR(255),
|
||||
workflow_type VARCHAR(50),
|
||||
is_active BOOLEAN DEFAULT FALSE,
|
||||
last_executed_at TIMESTAMP,
|
||||
execution_count INTEGER DEFAULT 0,
|
||||
success_count INTEGER DEFAULT 0,
|
||||
failure_count INTEGER DEFAULT 0,
|
||||
avg_duration_ms INTEGER,
|
||||
has_schedule BOOLEAN DEFAULT FALSE,
|
||||
has_webhook BOOLEAN DEFAULT FALSE,
|
||||
idle_days INTEGER,
|
||||
suggestion VARCHAR(100),
|
||||
checked_at TIMESTAMP DEFAULT NOW()
|
||||
);
|
||||
|
||||
CREATE INDEX IF NOT EXISTS idx_monitor_workflows_id ON monitor_workflows(workflow_id);
CREATE INDEX IF NOT EXISTS idx_monitor_workflows_active ON monitor_workflows(is_active);
CREATE INDEX IF NOT EXISTS idx_monitor_workflows_idle ON monitor_workflows(idle_days);
|
||||
|
||||
-- ============================================================
-- Layer 4: WordPress portal monitoring
-- ============================================================
|
||||
|
||||
CREATE TABLE IF NOT EXISTS monitor_portal_pages (
|
||||
id SERIAL PRIMARY KEY,
|
||||
page_url VARCHAR(500) NOT NULL,
|
||||
page_type VARCHAR(20),
|
||||
is_accessible BOOLEAN,
|
||||
response_time_ms INTEGER,
|
||||
http_status INTEGER,
|
||||
error_message TEXT,
|
||||
checked_at TIMESTAMP DEFAULT NOW()
|
||||
);
|
||||
|
||||
CREATE TABLE IF NOT EXISTS monitor_portal_users (
|
||||
id SERIAL PRIMARY KEY,
|
||||
user_id BIGINT,
|
||||
username VARCHAR(100),
|
||||
email VARCHAR(255),
|
||||
role VARCHAR(50),
|
||||
is_active BOOLEAN,
|
||||
last_login TIMESTAMP,
|
||||
created_at TIMESTAMP,
|
||||
detected_at TIMESTAMP DEFAULT NOW()
|
||||
);
|
||||
|
||||
CREATE INDEX idx_monitor_portal_pages_url ON monitor_portal_pages(page_url);
|
||||
CREATE INDEX idx_monitor_portal_users_username ON monitor_portal_users(username);
|
||||
|
||||
-- ============================================================
-- Layer 5: Database monitoring
-- ============================================================

CREATE TABLE IF NOT EXISTS monitor_databases (
    id SERIAL PRIMARY KEY,
    db_type VARCHAR(20) NOT NULL CHECK (db_type IN ('postgresql', 'redis', 'qdrant', 'mariadb', 'mongodb')),
    db_name VARCHAR(50),
    metric_name VARCHAR(50) NOT NULL,
    metric_value JSONB,
    checked_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_monitor_databases_type ON monitor_databases(db_type);
CREATE INDEX IF NOT EXISTS idx_monitor_databases_time ON monitor_databases(checked_at);

-- PostgreSQL table structure snapshot
CREATE TABLE IF NOT EXISTS monitor_pg_tables (
    id SERIAL PRIMARY KEY,
    database_name VARCHAR(50),
    schema_name VARCHAR(50),
    table_name VARCHAR(100),
    table_type VARCHAR(20),
    row_count BIGINT,
    table_size_bytes BIGINT,
    index_size_bytes BIGINT,
    snapshot_at TIMESTAMP DEFAULT NOW()
);

-- Table structure change log
CREATE TABLE IF NOT EXISTS monitor_pg_schema_changes (
    id SERIAL PRIMARY KEY,
    database_name VARCHAR(50),
    schema_name VARCHAR(50),
    table_name VARCHAR(100),
    change_type VARCHAR(20) CHECK (change_type IN ('table_created', 'table_dropped', 'column_added', 'column_removed', 'column_type_changed')),
    column_name VARCHAR(100),
    old_value TEXT,
    new_value TEXT,
    detected_at TIMESTAMP DEFAULT NOW()
);

-- Qdrant collection monitoring
CREATE TABLE IF NOT EXISTS monitor_qdrant_collections (
    id SERIAL PRIMARY KEY,
    collection_name VARCHAR(100),
    vectors_count BIGINT,
    points_count BIGINT,
    disk_size_bytes BIGINT,
    status VARCHAR(20),
    snapshot_at TIMESTAMP DEFAULT NOW()
);

-- ============================================================
-- Layer 6: User monitoring
-- ============================================================

-- Connection session tracking
CREATE TABLE IF NOT EXISTS monitor_sessions (
    id SERIAL PRIMARY KEY,
    session_type VARCHAR(20) CHECK (session_type IN ('ssh', 'web', 'db', 'sftp', 'rdp')),
    service_name VARCHAR(50),
    username VARCHAR(100),
    source_ip VARCHAR(45),
    source_port INTEGER,
    connected_at TIMESTAMP,
    last_activity_at TIMESTAMP,
    disconnected_at TIMESTAMP,
    bytes_sent BIGINT,
    bytes_received BIGINT,
    status VARCHAR(20) CHECK (status IN ('active', 'disconnected', 'timeout'))
);

-- Login history
CREATE TABLE IF NOT EXISTS monitor_logins (
    id SERIAL PRIMARY KEY,
    user_type VARCHAR(20) CHECK (user_type IN ('system', 'wordpress', 'n8n', 'gitea', 'sftpgo', 'database')),
    username VARCHAR(100),
    source_ip VARCHAR(45),
    user_agent TEXT,
    login_method VARCHAR(20),
    success BOOLEAN,
    failure_reason VARCHAR(200),
    login_at TIMESTAMP DEFAULT NOW()
);

-- sudo command log
CREATE TABLE IF NOT EXISTS monitor_sudo_history (
    id SERIAL PRIMARY KEY,
    username VARCHAR(100),
    command TEXT,
    run_as VARCHAR(100),
    tty VARCHAR(50),
    source_ip VARCHAR(45),
    exit_code INTEGER,
    executed_at TIMESTAMP DEFAULT NOW()
);

-- Resource usage tracking
CREATE TABLE IF NOT EXISTS monitor_resource_usage (
    id SERIAL PRIMARY KEY,
    user_type VARCHAR(20),
    username VARCHAR(100),
    service_name VARCHAR(50),
    cpu_percent DECIMAL(5,2),
    memory_mb INTEGER,
    disk_io_read_mb BIGINT,
    disk_io_write_mb BIGINT,
    network_rx_mb BIGINT,
    network_tx_mb BIGINT,
    recorded_at TIMESTAMP DEFAULT NOW()
);

-- Anomaly detection log
CREATE TABLE IF NOT EXISTS monitor_anomalies (
    id SERIAL PRIMARY KEY,
    anomaly_type VARCHAR(50) CHECK (anomaly_type IN ('brute_force', 'privilege_escalation', 'unusual_access', 'unusual_time', 'excessive_queries', 'idle_session', 'schema_change')),
    severity VARCHAR(20) CHECK (severity IN ('low', 'medium', 'high', 'critical')),
    source_type VARCHAR(20),
    username VARCHAR(100),
    source_ip VARCHAR(45),
    description TEXT,
    details JSONB,
    detected_at TIMESTAMP DEFAULT NOW(),
    resolved BOOLEAN DEFAULT FALSE,
    resolved_at TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_monitor_sessions_type ON monitor_sessions(session_type);
CREATE INDEX IF NOT EXISTS idx_monitor_sessions_username ON monitor_sessions(username);
CREATE INDEX IF NOT EXISTS idx_monitor_logins_type ON monitor_logins(user_type);
CREATE INDEX IF NOT EXISTS idx_monitor_logins_time ON monitor_logins(login_at);
CREATE INDEX IF NOT EXISTS idx_monitor_anomalies_type ON monitor_anomalies(anomaly_type);
CREATE INDEX IF NOT EXISTS idx_monitor_anomalies_severity ON monitor_anomalies(severity);
CREATE INDEX IF NOT EXISTS idx_monitor_anomalies_time ON monitor_anomalies(detected_at);

-- ============================================================
-- Layer 7: Storage monitoring
-- ============================================================

-- File registry
-- gen_random_uuid() is built in from PostgreSQL 13; earlier versions need the pgcrypto extension.
CREATE TABLE IF NOT EXISTS file_registry (
    file_uuid UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    file_name VARCHAR(255) NOT NULL,
    file_path TEXT NOT NULL,
    file_path_hash VARCHAR(64) NOT NULL,
    file_size BIGINT NOT NULL,
    file_hash VARCHAR(64),
    mime_type VARCHAR(100),
    user_cluster VARCHAR(50) CHECK (user_cluster IN ('family', 'work', 'wordpress', 'shared', 'system')),
    owner_id VARCHAR(100),
    storage_tier VARCHAR(20) DEFAULT 'hot' CHECK (storage_tier IN ('hot', 'warm', 'cold')),
    storage_location VARCHAR(500),
    status VARCHAR(20) DEFAULT 'active' CHECK (status IN ('active', 'temporary', 'archived', 'deleted')),
    is_registered BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    last_accessed_at TIMESTAMP,
    access_count INTEGER DEFAULT 0,
    archived_at TIMESTAMP,
    archive_location VARCHAR(500),
    retention_until TIMESTAMP,
    UNIQUE(file_path_hash)
);

-- Storage usage statistics
CREATE TABLE IF NOT EXISTS storage_usage_stats (
    id SERIAL PRIMARY KEY,
    user_cluster VARCHAR(50),
    storage_tier VARCHAR(20),
    file_count BIGINT,
    total_size_bytes BIGINT,
    record_time TIMESTAMP DEFAULT NOW()
);

-- File access log
CREATE TABLE IF NOT EXISTS storage_access_logs (
    id SERIAL PRIMARY KEY,
    user_cluster VARCHAR(50),
    owner_id VARCHAR(100),
    file_path TEXT,
    access_type VARCHAR(20) CHECK (access_type IN ('read', 'write', 'delete', 'download', 'move')),
    access_time TIMESTAMP DEFAULT NOW(),
    client_ip VARCHAR(45),
    access_method VARCHAR(20)
);

-- File lifecycle
CREATE TABLE IF NOT EXISTS file_lifecycle (
    id SERIAL PRIMARY KEY,
    file_uuid UUID REFERENCES file_registry(file_uuid),
    file_path TEXT,
    user_cluster VARCHAR(50),
    storage_tier VARCHAR(20),
    created_at TIMESTAMP,
    last_accessed_at TIMESTAMP,
    last_modified_at TIMESTAMP,
    access_count INTEGER DEFAULT 0,
    current_status VARCHAR(20) DEFAULT 'active',
    tier_migration_count INTEGER DEFAULT 0,
    migrated_at TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_file_registry_cluster ON file_registry(user_cluster);
CREATE INDEX IF NOT EXISTS idx_file_registry_tier ON file_registry(storage_tier);
CREATE INDEX IF NOT EXISTS idx_file_registry_status ON file_registry(status);
CREATE INDEX IF NOT EXISTS idx_storage_usage_cluster ON storage_usage_stats(user_cluster);
CREATE INDEX IF NOT EXISTS idx_storage_usage_time ON storage_usage_stats(record_time);

-- ============================================================
-- External monitoring (Layer 1)
-- ============================================================

CREATE TABLE IF NOT EXISTS monitor_external (
    id SERIAL PRIMARY KEY,
    target_name VARCHAR(50) NOT NULL,
    target_type VARCHAR(20) CHECK (target_type IN ('ddns', 'gateway', 'internet', 'api')),
    target_host VARCHAR(255),
    is_reachable BOOLEAN,
    response_time_ms INTEGER,
    dns_resolved_ip VARCHAR(45),
    error_message TEXT,
    checked_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_monitor_external_name ON monitor_external(target_name);
CREATE INDEX IF NOT EXISTS idx_monitor_external_time ON monitor_external(checked_at);

-- ============================================================
-- Monitoring configuration table
-- ============================================================

CREATE TABLE IF NOT EXISTS monitor_config (
    id SERIAL PRIMARY KEY,
    config_key VARCHAR(50) UNIQUE NOT NULL,
    config_value TEXT,
    description VARCHAR(255),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Insert default configuration
INSERT INTO monitor_config (config_key, config_value, description) VALUES
    ('check_interval', '300', 'Monitoring check interval (seconds)'),
    ('retention_days', '30', 'Days to retain historical data'),
    ('idle_threshold_days', '30', 'Workflow idle threshold (days)'),
    ('alert_threshold_bruteforce', '5', 'Brute-force attempt count threshold'),
    ('alert_threshold_slow_response', '3000', 'Response time threshold (ms)')
ON CONFLICT (config_key) DO NOTHING;

-- ============================================================
-- View definitions
-- ============================================================

-- Service health view
-- Grouped by service_name and status, so each service yields one row per
-- status observed in the last 24 hours.
CREATE OR REPLACE VIEW v_service_health AS
SELECT
    service_name,
    status,
    COUNT(*) AS check_count,
    COUNT(*) FILTER (WHERE status = 'up') AS up_count,
    COUNT(*) FILTER (WHERE status = 'down') AS down_count,
    AVG(response_time_ms) AS avg_response_time,
    MAX(checked_at) AS last_check
FROM monitor_services
WHERE checked_at > NOW() - INTERVAL '24 hours'
GROUP BY service_name, status;

-- Recent anomalies view
CREATE OR REPLACE VIEW v_recent_anomalies AS
SELECT
    anomaly_type,
    severity,
    username,
    source_ip,
    description,
    detected_at
FROM monitor_anomalies
WHERE detected_at > NOW() - INTERVAL '24 hours'
ORDER BY detected_at DESC;

-- Idle workflow view
-- The 30-day cutoff matches the idle_threshold_days default in monitor_config
-- but is hard-coded here.
CREATE OR REPLACE VIEW v_idle_workflows AS
SELECT
    workflow_name,
    idle_days,
    suggestion,
    last_executed_at
FROM monitor_workflows
WHERE idle_days > 30 AND is_active = TRUE
ORDER BY idle_days DESC;

-- Storage usage overview view
CREATE OR REPLACE VIEW v_storage_overview AS
SELECT
    user_cluster,
    storage_tier,
    COUNT(*) AS file_count,
    SUM(file_size) AS total_size
FROM file_registry
WHERE status = 'active'
GROUP BY user_cluster, storage_tier;

-- ============================================================
-- Backup monitoring (Layer 7 Extension)
-- ============================================================

-- Backup registry
CREATE TABLE IF NOT EXISTS backup_registry (
    id SERIAL PRIMARY KEY,
    service_name VARCHAR(50) NOT NULL,
    backup_file VARCHAR(500) NOT NULL,
    backup_size_bytes BIGINT,
    backup_type VARCHAR(20) CHECK (backup_type IN ('daily', 'weekly', 'monthly', 'archive', 'full', 'incremental')),
    backup_method VARCHAR(20) CHECK (backup_method IN ('pg_dump', 'mysqldump', 'tar', 'snapshot', 'dump')),
    status VARCHAR(20) CHECK (status IN ('pending', 'running', 'completed', 'failed', 'verified')),
    compression_ratio DECIMAL(5,2),
    verification_result BOOLEAN,
    error_message TEXT,
    started_at TIMESTAMP DEFAULT NOW(),
    completed_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Backup storage statistics
CREATE TABLE IF NOT EXISTS backup_storage_stats (
    id SERIAL PRIMARY KEY,
    tier VARCHAR(20) CHECK (tier IN ('daily', 'weekly', 'monthly', 'archive', 'total')),
    file_count BIGINT,
    total_size_bytes BIGINT,
    record_time TIMESTAMP DEFAULT NOW()
);

-- Backup history
CREATE TABLE IF NOT EXISTS backup_history (
    id SERIAL PRIMARY KEY,
    service_name VARCHAR(50) NOT NULL,
    operation VARCHAR(20) CHECK (operation IN ('backup', 'restore', 'tier_migration', 'cleanup', 'verify')),
    backup_file VARCHAR(500),
    backup_tier VARCHAR(20),
    source_tier VARCHAR(20),
    dest_tier VARCHAR(20),
    file_count BIGINT,
    size_bytes BIGINT,
    duration_seconds INTEGER,
    status VARCHAR(20) CHECK (status IN ('success', 'failed', 'partial')),
    error_message TEXT,
    executed_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_backup_registry_service ON backup_registry(service_name);
CREATE INDEX IF NOT EXISTS idx_backup_registry_time ON backup_registry(created_at);
CREATE INDEX IF NOT EXISTS idx_backup_storage_stats_tier ON backup_storage_stats(tier);
CREATE INDEX IF NOT EXISTS idx_backup_storage_stats_time ON backup_storage_stats(record_time);
CREATE INDEX IF NOT EXISTS idx_backup_history_service ON backup_history(service_name);
CREATE INDEX IF NOT EXISTS idx_backup_history_time ON backup_history(executed_at);

-- ============================================================
-- Node.js version baseline monitoring
-- ============================================================

CREATE TABLE IF NOT EXISTS node_version_baseline (
    id SERIAL PRIMARY KEY,
    runtime_name VARCHAR(50) NOT NULL,
    required_version VARCHAR(20) NOT NULL,
    current_version VARCHAR(20),
    process_name VARCHAR(100),
    process_path TEXT,
    is_compliant BOOLEAN,
    locked_path VARCHAR(500),
    checked_at TIMESTAMP DEFAULT NOW()
);

-- Node.js process tracking
CREATE TABLE IF NOT EXISTS node_process_tracking (
    id SERIAL PRIMARY KEY,
    process_name VARCHAR(100) NOT NULL,
    pid INTEGER,
    command VARCHAR(500),
    node_version VARCHAR(20),
    is_managed BOOLEAN DEFAULT FALSE,
    started_at TIMESTAMP,
    checked_at TIMESTAMP DEFAULT NOW()
);

-- ============================================================
-- Python version baseline monitoring
-- ============================================================

CREATE TABLE IF NOT EXISTS python_version_baseline (
    id SERIAL PRIMARY KEY,
    runtime_name VARCHAR(50) NOT NULL,
    required_version VARCHAR(20) NOT NULL,
    current_version VARCHAR(20),
    interpreter_path VARCHAR(500),
    is_compliant BOOLEAN,
    checked_at TIMESTAMP DEFAULT NOW()
);

-- Python script tracking
CREATE TABLE IF NOT EXISTS python_script_tracking (
    id SERIAL PRIMARY KEY,
    script_path TEXT NOT NULL,
    shebang_version VARCHAR(20),
    actual_version VARCHAR(20),
    is_compliant BOOLEAN DEFAULT FALSE,
    last_run_at TIMESTAMP,
    exit_code INTEGER,
    error_output TEXT,
    checked_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_node_version_name ON node_version_baseline(runtime_name);
CREATE INDEX IF NOT EXISTS idx_node_process_name ON node_process_tracking(process_name);
CREATE INDEX IF NOT EXISTS idx_python_version_name ON python_version_baseline(runtime_name);
CREATE INDEX IF NOT EXISTS idx_python_script_path ON python_script_tracking(script_path);
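The schema seeds a `retention_days` value in `monitor_config`, but this part of the commit does not show how old rows get pruned. The following is a minimal sketch under stated assumptions: `build_retention_sql` is a hypothetical helper (not part of the original commit) that only prints a `DELETE` statement for a given table and timestamp column; running it through `psql` and reading the day count from `monitor_config` are left to the caller.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original commit):
# prints a DELETE statement that prunes rows older than N days.
build_retention_sql() {
    local table=$1
    local time_column=$2
    local days=$3

    # Both identifiers must be non-empty.
    if [ -z "$table" ] || [ -z "$time_column" ]; then
        echo "missing identifier" >&2
        return 1
    fi

    # Accept only lowercase identifiers made of letters, digits and underscores.
    case "$table$time_column" in
        *[!a-z0-9_]*) echo "invalid identifier" >&2; return 1 ;;
    esac

    # Accept only a non-negative integer day count.
    case "$days" in
        ''|*[!0-9]*) echo "invalid day count" >&2; return 1 ;;
    esac

    printf "DELETE FROM %s WHERE %s < NOW() - INTERVAL '%s days';\n" \
        "$table" "$time_column" "$days"
}

# Example: statement for monitor_services using the seeded default of 30 days.
build_retention_sql monitor_services checked_at 30
```

The table and column names in the example come from the schema above; validating them before interpolation keeps the generated statement from carrying arbitrary SQL.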
175
monitor/portal/page_monitor.sh
Executable file
@@ -0,0 +1,175 @@
#!/bin/bash

# Momentry WordPress Portal monitoring (Layer 4)
# Path: /Users/accusys/momentry_core_0.1/monitor/portal/page_monitor.sh

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
MONITOR_DIR="$(dirname "$SCRIPT_DIR")"
LOG_DIR="/Users/accusys/momentry/log/monitor"

mkdir -p "$LOG_DIR"
LOG_FILE="$LOG_DIR/portal_check.log"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

# WordPress configuration
WP_SITE="https://wp.momentry.ddns.net"
WP_DB_HOST="localhost"
WP_DB_NAME="wordpress"
WP_DB_USER="wp_user"
# NOTE: the credential is hard-coded; consider loading it from the
# environment or a protected file instead.
WP_DB_PASS="wp_password_123"

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

# Record a page check result
# NOTE: arguments are interpolated into the SQL unescaped; a single quote
# in any of them breaks the statement.
record_page() {
    local url=$1
    local page_type=$2
    local accessible=$3
    local response_time=$4
    local http_status=$5
    local error=$6

    psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
INSERT INTO monitor_portal_pages (page_url, page_type, is_accessible, response_time_ms, http_status, error_message, checked_at)
VALUES ('$url', '$page_type', $accessible, $response_time, $http_status, '$error', NOW());
EOF
}

# Record a user
# NOTE: monitor_portal_users has no unique constraint besides id, so
# ON CONFLICT DO NOTHING never triggers and each run inserts a new row per user.
record_user() {
    local user_id=$1
    local username=$2
    local email=$3
    local role=$4
    local is_active=$5
    local last_login=$6
    local created_at=$7

    psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
INSERT INTO monitor_portal_users (user_id, username, email, role, is_active, last_login, created_at, detected_at)
VALUES ($user_id, '$username', '$email', '$role', $is_active, $last_login, $created_at, NOW())
ON CONFLICT DO NOTHING;
EOF
}

# Record an anomaly
record_anomaly() {
    local anomaly_type=$1
    local severity=$2
    local username=$3
    local description=$4

    psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
INSERT INTO monitor_anomalies (anomaly_type, severity, source_type, username, description, detected_at)
VALUES ('$anomaly_type', '$severity', 'wordpress', '$username', '$description', NOW());
EOF
}

# Check a page
check_page() {
    local url=$1
    local page_type=$2
    local accessible error

    # NOTE: %N (nanoseconds) is a GNU date extension; where date lacks it,
    # the arithmetic below fails.
    local start=$(date +%s%N)
    # curl itself prints 000 for %{http_code} when the request fails, so no
    # "|| echo 000" fallback is appended (it would produce "000000").
    local http_code=$(curl -s -o /dev/null -w "%{http_code}" "$url" --max-time 10 -k -L 2>/dev/null)
    http_code=${http_code:-000}
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ "$http_code" = "200" ]; then
        accessible="true"
        error=""
        echo -e "${GREEN}✓${NC} $page_type - ${ms}ms (HTTP $http_code)"
    else
        accessible="false"
        error="HTTP $http_code"
        echo -e "${RED}✗${NC} $page_type - HTTP $http_code"
    fi

    record_page "$url" "$page_type" "$accessible" "$ms" "$http_code" "$error"
}

# Check users
check_users() {
    echo ""
    echo "WordPress user check:"
    echo "----------------------------------------"

    # Fetch the user list
    local users
    users=$(mysql -u"$WP_DB_USER" -p"$WP_DB_PASS" -h "$WP_DB_HOST" "$WP_DB_NAME" -N -e "
SELECT u.ID, u.user_login, u.user_email, u.user_registered, u.user_status,
COALESCE(m.meta_value, 'subscriber') as role
FROM wp_users u
LEFT JOIN wp_usermeta m ON u.ID = m.user_id AND m.meta_key = 'wp_capabilities'
ORDER BY u.ID;
" 2>/dev/null)

    if [ -z "$users" ]; then
        echo "Cannot connect to the WordPress database"
        return 1
    fi

    local admin_count=0
    local total_users=0

    # mysql -N -e prints tab-separated columns, so split on tabs rather than '|'
    while IFS=$'\t' read -r id login email registered status role; do
        [ -z "$id" ] && continue

        total_users=$((total_users + 1))

        # Determine the role from the serialized wp_capabilities value
        if echo "$role" | grep -q "administrator"; then
            admin_count=$((admin_count + 1))
            role="administrator"
        elif echo "$role" | grep -q "editor"; then
            role="editor"
        elif echo "$role" | grep -q "author"; then
            role="author"
        elif echo "$role" | grep -q "contributor"; then
            role="contributor"
        else
            role="subscriber"
        fi

        # Record the user
        record_user "$id" "$login" "$email" "$role" "true" "NULL" "'$registered'"

        echo " - $login ($role)"

    done <<< "$users"

    echo "----------------------------------------"
    echo "Total users: $total_users | Administrators: $admin_count"
}

# Main
echo "========================================"
echo "Layer 4: WordPress Portal Monitoring"
echo "Time: $(date)"
echo "========================================"
echo ""

echo "Page accessibility check:"
echo "----------------------------------------"

# Check the homepage
check_page "$WP_SITE/" "homepage"

# Check the login page
check_page "$WP_SITE/wp-login.php" "login_page"

# Check the wp-json API
check_page "$WP_SITE/wp-json/" "api"

echo ""
check_users

echo ""
echo "========================================"
log "Portal check completed"
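The `record_*` functions in page_monitor.sh place shell variables straight inside SQL string literals, so a value containing a single quote (an error message such as `can't connect`, for instance) produces an invalid statement. A minimal sketch of one way to guard against that, assuming a hypothetical helper named `sql_quote` that is not part of the original script: it doubles every single quote and wraps the result in quotes, which is the standard SQL escaping for string literals.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original script):
# returns its argument as a single-quoted SQL string literal.
sql_quote() {
    local escaped
    # Double every single quote: ' becomes ''
    escaped=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "'%s'" "$escaped"
}

# Example: build the VALUES fragment for an error message containing a quote.
error="HTTP 500: can't connect"
echo "VALUES ($(sql_quote "$error"));"
```

With this helper the heredoc would use `$(sql_quote "$error")` in place of `'$error'`. psql also offers its own quoting through variables set with `-v name=value` and referenced as `:'name'`, which may be a better fit when the statement is fed on standard input.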
93
monitor/service/external_monitor.sh
Executable file
@@ -0,0 +1,93 @@
#!/bin/bash

# Momentry external monitoring (Layer 1)
# Path: /Users/accusys/momentry_core_0.1/monitor/service/external_monitor.sh

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
LOG_DIR="/Users/accusys/momentry/log/monitor"

mkdir -p "$LOG_DIR"
LOG_FILE="$LOG_DIR/external_check.log"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

# Record a result
record_external() {
    local target=$1
    local target_type=$2
    local reachable=$3
    local response_time=$4
    local error=$5

    psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
INSERT INTO monitor_external (target_name, target_type, is_reachable, response_time_ms, error_message, checked_at)
VALUES ('$target', '$target_type', $reachable, $response_time, '$error', NOW());
EOF
}

# Check DDNS
check_ddns() {
    # NOTE: %N (nanoseconds) is a GNU date extension; where date lacks it,
    # the arithmetic below fails.
    local start=$(date +%s%N)
    local ip=$(dig +short momentry.ddns.net 2>/dev/null | tail -1)
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ -n "$ip" ]; then
        echo "✓ DDNS (momentry.ddns.net) -> $ip (${ms}ms)"
        record_external "ddns" "ddns" "true" "$ms" ""
        return 0
    else
        echo "✗ DDNS (momentry.ddns.net) - DNS resolution failed"
        record_external "ddns" "ddns" "false" "0" "DNS resolution failed"
        return 1
    fi
}

# Check the gateway
# NOTE: the unit of ping's -W option differs by platform (seconds on Linux
# iputils, milliseconds on macOS), so "-W 2" may be a very short timeout.
check_gateway() {
    local start=$(date +%s%N)
    if ping -c 1 -W 2 192.168.110.1 > /dev/null 2>&1; then
        local end=$(date +%s%N)
        local ms=$(( (end - start) / 1000000 ))
        echo "✓ Gateway (192.168.110.1) - ${ms}ms"
        record_external "gateway" "gateway" "true" "$ms" ""
        return 0
    else
        echo "✗ Gateway (192.168.110.1) - Unreachable"
        record_external "gateway" "gateway" "false" "0" "Unreachable"
        return 1
    fi
}

# Check Internet connectivity
check_internet() {
    local start=$(date +%s%N)
    if ping -c 1 -W 2 8.8.8.8 > /dev/null 2>&1; then
        local end=$(date +%s%N)
        local ms=$(( (end - start) / 1000000 ))
        echo "✓ Internet (8.8.8.8) - ${ms}ms"
        record_external "internet" "internet" "true" "$ms" ""
        return 0
    else
        echo "✗ Internet (8.8.8.8) - Unreachable"
        record_external "internet" "internet" "false" "0" "Unreachable"
        return 1
    fi
}

# Main
echo "========================================"
echo "Layer 1: External Monitoring"
echo "Time: $(date)"
echo "========================================"
echo ""

check_ddns
check_gateway
check_internet

echo ""
echo "========================================"
log "External check completed"
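These scripts time each check with `date +%s%N`, which depends on the GNU `%N` extension and may not work with the `date` shipped on the macOS host the paths suggest. A minimal sketch of a portable alternative, assuming a hypothetical helper named `now_ms` that is not part of the original scripts: it uses bash's `EPOCHREALTIME` when available (bash 5 and later) and otherwise falls back to Perl's core `Time::HiRes` module.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original scripts):
# prints the current Unix time in milliseconds without relying on date +%N.
now_ms() {
    if [ -n "${EPOCHREALTIME:-}" ]; then
        # bash >= 5: EPOCHREALTIME looks like "seconds.microseconds";
        # dropping the separator yields microseconds as one integer.
        local micros=${EPOCHREALTIME/[.,]/}
        echo $(( micros / 1000 ))
    else
        # Older bash (e.g. 3.2): ask Perl for a high-resolution timestamp.
        perl -MTime::HiRes=time -e 'printf "%d\n", time() * 1000'
    fi
}

# Example: measure how long a command takes, in milliseconds.
start=$(now_ms)
sleep 0.1
end=$(now_ms)
echo "elapsed: $(( end - start ))ms"
```

Inside a check function, `local start=$(now_ms)` and `local end=$(now_ms)` would replace the two `date` calls, and the elapsed time becomes `$(( end - start ))` with no further division.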
370
monitor/service/health_check.sh
Executable file
@@ -0,0 +1,370 @@
#!/bin/bash

# Momentry service health check (Layer 2)
# Path: /Users/accusys/momentry_core_0.1/monitor/service/health_check.sh

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
MONITOR_DIR="$(dirname "$SCRIPT_DIR")"
LOG_DIR="/Users/accusys/momentry/log/monitor"

mkdir -p "$LOG_DIR"
LOG_FILE="$LOG_DIR/service_check.log"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

# Record a result in the database
record_service() {
    local service=$1
    local status=$2
    local response_time=$3
    local error_msg=$4

    psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
INSERT INTO monitor_services (service_name, service_type, status, response_time_ms, error_message, checked_at)
VALUES ('$service', 'service', '$status', $response_time, '$error_msg', NOW());
EOF
}

# Check PostgreSQL
# NOTE: %N (nanoseconds) is a GNU date extension; where date lacks it,
# the timing arithmetic in these checks fails.
check_postgresql() {
    local start=$(date +%s%N)
    if pg_isready -h localhost -p 5432 -U accusys > /dev/null 2>&1; then
        local end=$(date +%s%N)
        local ms=$(( (end - start) / 1000000 ))
        echo -e "${GREEN}✓${NC} PostgreSQL (5432) - ${ms}ms"
        record_service "postgresql" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} PostgreSQL (5432) - Down"
        record_service "postgresql" "down" "0" "Connection failed"
        return 1
    fi
}

# Check Redis
check_redis() {
    local start=$(date +%s%N)
    if redis-cli -a accusys ping 2>/dev/null | grep -q "PONG"; then
        local end=$(date +%s%N)
        local ms=$(( (end - start) / 1000000 ))
        echo -e "${GREEN}✓${NC} Redis (6379) - ${ms}ms"
        record_service "redis" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} Redis (6379) - Down"
        record_service "redis" "down" "0" "Connection failed"
        return 1
    fi
}

# Check MariaDB
check_mariadb() {
    local start=$(date +%s%N)
    if mysql -u accusys -e "SELECT 1" > /dev/null 2>&1; then
        local end=$(date +%s%N)
        local ms=$(( (end - start) / 1000000 ))
        echo -e "${GREEN}✓${NC} MariaDB (3306) - ${ms}ms"
        record_service "mariadb" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} MariaDB (3306) - Down"
        record_service "mariadb" "down" "0" "Connection failed"
        return 1
    fi
}

# Check n8n
check_n8n() {
    local start=$(date +%s%N)
    local http_code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8085/ --max-time 5)
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ "$http_code" = "200" ] || [ "$http_code" = "302" ]; then
        echo -e "${GREEN}✓${NC} n8n (8085) - ${ms}ms"
        record_service "n8n" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} n8n (8085) - HTTP $http_code"
        record_service "n8n" "down" "0" "HTTP $http_code"
        return 1
    fi
}

# Check Caddy
check_caddy() {
    local start=$(date +%s%N)
    local http_code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:2019/config/ --max-time 5)
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ "$http_code" = "200" ]; then
        echo -e "${GREEN}✓${NC} Caddy (2019) - ${ms}ms"
        record_service "caddy" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} Caddy (2019) - HTTP $http_code"
        record_service "caddy" "down" "0" "HTTP $http_code"
        return 1
    fi
}

# Check Gitea
check_gitea() {
    local start=$(date +%s%N)
    local http_code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/ --max-time 5)
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ "$http_code" = "200" ]; then
        echo -e "${GREEN}✓${NC} Gitea (3000) - ${ms}ms"
        record_service "gitea" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} Gitea (3000) - HTTP $http_code"
        record_service "gitea" "down" "0" "HTTP $http_code"
        return 1
    fi
}

# Check SFTPGo
check_sftpgo() {
    local start=$(date +%s%N)
    local http_code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080 --max-time 5)
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ "$http_code" = "200" ] || [ "$http_code" = "301" ] || [ "$http_code" = "302" ]; then
        echo -e "${GREEN}✓${NC} SFTPGo (8080) - ${ms}ms"
        record_service "sftpgo" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} SFTPGo (8080) - HTTP $http_code"
        record_service "sftpgo" "down" "0" "HTTP $http_code"
        return 1
    fi
}

# Check Ollama
check_ollama() {
    local start=$(date +%s%N)
    local http_code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:11434/api/tags --max-time 5)
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ "$http_code" = "200" ]; then
        echo -e "${GREEN}✓${NC} Ollama (11434) - ${ms}ms"
        record_service "ollama" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} Ollama (11434) - HTTP $http_code"
        record_service "ollama" "down" "0" "HTTP $http_code"
        return 1
    fi
}

# Check Qdrant (a 401 still means the service is answering)
check_qdrant() {
    local start=$(date +%s%N)
    local http_code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:6333/collections --max-time 5)
    local end=$(date +%s%N)
    local ms=$(( (end - start) / 1000000 ))

    if [ "$http_code" = "200" ] || [ "$http_code" = "401" ]; then
        echo -e "${GREEN}✓${NC} Qdrant (6333) - ${ms}ms"
        record_service "qdrant" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} Qdrant (6333) - HTTP $http_code"
        record_service "qdrant" "down" "0" "HTTP $http_code"
        return 1
    fi
}

# Check MongoDB
check_mongodb() {
    local start=$(date +%s%N)
    if mongosh --quiet --eval "db.adminCommand('ping')" > /dev/null 2>&1; then
        local end=$(date +%s%N)
        local ms=$(( (end - start) / 1000000 ))
        echo -e "${GREEN}✓${NC} MongoDB (27017) - ${ms}ms"
        record_service "mongodb" "up" "$ms" ""
        return 0
    else
        echo -e "${RED}✗${NC} MongoDB (27017) - Down"
        record_service "mongodb" "down" "0" "Connection failed"
        return 1
    fi
}

# Check PHP-FPM (process presence only; no response time is measured)
check_php() {
    if pgrep -f "php-fpm" > /dev/null 2>&1; then
        echo -e "${GREEN}✓${NC} PHP-FPM - Running"
        record_service "php" "up" "1" ""
        return 0
    else
        echo -e "${RED}✗${NC} PHP-FPM - Not running"
        record_service "php" "down" "0" "Process not found"
        return 1
    fi
}

# Check RustDesk
check_rustdesk() {
    local hbbs_ok=false
    local hbbr_ok=false

    # hbbs listens on 21116, hbbr on 21117
    if nc -z localhost 21116 > /dev/null 2>&1; then
        hbbs_ok=true
    fi

    if nc -z localhost 21117 > /dev/null 2>&1; then
        hbbr_ok=true
    fi

    if $hbbs_ok && $hbbr_ok; then
        echo -e "${GREEN}✓${NC} RustDesk (21116/21117) - Running"
        record_service "rustdesk" "up" "1" ""
        return 0
    elif $hbbs_ok || $hbbr_ok; then
        # Only one of the two ports answers
        echo -e "${YELLOW}⚠${NC} RustDesk - Partial (hbbs: $hbbs_ok, hbbr: $hbbr_ok)"
        record_service "rustdesk" "degraded" "0" "hbbs:$hbbs_ok hbbr:$hbbr_ok"
        return 1
    else
        # Neither port answers: report the service as down, not degraded
        echo -e "${RED}✗${NC} RustDesk (21116/21117) - Down"
        record_service "rustdesk" "down" "0" "hbbs:$hbbs_ok hbbr:$hbbr_ok"
        return 1
    fi
}
|
||||
|
||||
# Check the Node.js version
|
||||
check_node() {
|
||||
local LOCKED_NODE_VERSION="22"
|
||||
local version_issues=0
|
||||
|
||||
local node_pids=$(pgrep -f "n8n" 2>/dev/null)
|
||||
|
||||
if [ -z "$node_pids" ]; then
|
||||
echo -e "${YELLOW}⚠${NC} Node.js - n8n not running"
|
||||
record_service "node" "degraded" "1" "n8n not running"
|
||||
return 1
|
||||
fi
|
||||
|
||||
for pid in $node_pids; do
|
||||
local node_path=$(lsof -p $pid 2>/dev/null | grep "txt" | grep "node" | head -1 | awk '{print $NF}' | grep -v "dylib")
|
||||
|
||||
if [ -n "$node_path" ] && [ -f "$node_path" ]; then
|
||||
local node_version=$("$node_path" --version 2>/dev/null | sed 's/v//')
|
||||
local node_major=$(echo "$node_version" | cut -d. -f1)
|
||||
|
||||
if [ "$node_major" != "$LOCKED_NODE_VERSION" ]; then
|
||||
version_issues=$((version_issues + 1))
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
if [ $version_issues -gt 0 ]; then
|
||||
echo -e "${RED}✗${NC} Node.js - Version issues detected"
|
||||
record_service "node" "degraded" "1" "$version_issues version issues"
|
||||
return 1
|
||||
else
|
||||
echo -e "${GREEN}✓${NC} Node.js (${LOCKED_NODE_VERSION}.x) - Running"
|
||||
record_service "node" "up" "1" ""
|
||||
return 0
|
||||
fi
|
||||
}
|
||||
|
||||
# Check the Python version
|
||||
check_python() {
|
||||
local LOCKED_PYTHON_VERSION="3.11.14"
|
||||
local script_issues=0
|
||||
|
||||
local scripts=(
|
||||
"/Users/accusys/momentry_core_0.1/scripts/asr_processor.py"
|
||||
"/Users/accusys/momentry_core_0.1/scripts/thumbnail_extractor.py"
|
||||
)
|
||||
|
||||
for script in "${scripts[@]}"; do
|
||||
if [ -f "$script" ]; then
|
||||
local shebang=$(head -1 "$script")
|
||||
|
||||
if [[ "$shebang" != *"python3.11"* ]]; then
|
||||
script_issues=$((script_issues + 1))
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
if [ $script_issues -gt 0 ]; then
|
||||
echo -e "${RED}✗${NC} Python - Script version issues"
|
||||
record_service "python" "degraded" "1" "$script_issues script issues"
|
||||
return 1
|
||||
else
|
||||
echo -e "${GREEN}✓${NC} Python (${LOCKED_PYTHON_VERSION}) - Configured"
|
||||
record_service "python" "up" "1" ""
|
||||
return 0
|
||||
fi
|
||||
}
|
||||
|
||||
# Main program
|
||||
echo "========================================"
|
||||
echo "Layer 2: Service Health Check"
|
||||
echo "Time: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
total=0
passed=0

# Run every check in order; a check counts as passed when it returns 0.
for check in check_postgresql check_redis check_mariadb check_n8n check_caddy \
             check_gitea check_sftpgo check_ollama check_qdrant check_mongodb \
             check_php check_rustdesk check_node check_python; do
    total=$((total + 1))
    "$check" && passed=$((passed + 1))
done
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
echo "Result: $passed / $total services healthy"
|
||||
echo "========================================"
|
||||
|
||||
log "Service check completed: $passed/$total healthy"
|
||||
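The Ollama and Qdrant checks in this script differ mainly in which HTTP status codes count as healthy: Ollama accepts only 200, while Qdrant accepts 200 or 401. A minimal sketch of how that rule could be factored out, assuming a hypothetical helper named `http_status_ok` that is not part of this commit:

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): succeed when the
# observed HTTP status code equals any of the accepted codes.
http_status_ok() {
    local code=$1
    shift
    local accepted
    for accepted in "$@"; do
        [ "$code" = "$accepted" ] && return 0
    done
    return 1
}

# Example: a Qdrant-style check that treats 401 as "service is up".
if http_status_ok "401" 200 401; then
    echo "qdrant: up"
fi
```

Each check would then pass its own list of accepted codes, which keeps the per-service policy visible at the call site.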
270
monitor/service/node_monitor.sh
Executable file
@@ -0,0 +1,270 @@
|
||||
#!/bin/bash
|
||||
|
||||
#===============================================================================
|
||||
# Momentry Node.js monitoring script
# Path: /Users/accusys/momentry_core_0.1/monitor/service/node_monitor.sh
#
# What it monitors:
#   - Node.js version lock for n8n (22.x)
#   - Process count and state
#   - Resource usage
#
# Usage:
#   ./node_monitor.sh status     # show monitoring status
#   ./node_monitor.sh baseline   # check n8n's Node.js version against the locked major version
#   ./node_monitor.sh check      # same as status
|
||||
#===============================================================================
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
MONITOR_DIR="$(dirname "$SCRIPT_DIR")"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/node_check.log"
|
||||
|
||||
# Color definitions
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
BLUE='\033[0;34m'
|
||||
NC='\033[0m'
|
||||
|
||||
# Locked Node.js version
|
||||
LOCKED_NODE_VERSION="22"
|
||||
LOCKED_NODE_MINOR="22"
|
||||
|
||||
#===============================================================================
|
||||
# Logging functions
|
||||
#===============================================================================
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
log_success() {
|
||||
echo -e "${GREEN}[$(date '+%Y-%m-%d %H:%M:%S')] ✅ $1${NC}" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
log_error() {
|
||||
echo -e "${RED}[$(date '+%Y-%m-%d %H:%M:%S')] ❌ $1${NC}" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
log_warn() {
|
||||
echo -e "${YELLOW}[$(date '+%Y-%m-%d %H:%M:%S')] ⚠️ $1${NC}" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Record to the database
|
||||
#===============================================================================
|
||||
record_node_baseline() {
|
||||
local runtime_name=$1
|
||||
local current_version=$2
|
||||
local process_path=$3
|
||||
local pid=$4
|
||||
|
||||
local required_version="${LOCKED_NODE_VERSION}.x"
|
||||
local is_compliant="false"
|
||||
if [[ "$current_version" == "${LOCKED_NODE_VERSION}".* ]]; then
|
||||
is_compliant="true"
|
||||
fi
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO node_version_baseline (runtime_name, required_version, current_version, process_name, process_path, is_compliant, locked_path, checked_at)
|
||||
VALUES ('$runtime_name', '$required_version', '$current_version', 'node', '$process_path', $is_compliant, '$process_path', NOW())
|
||||
ON CONFLICT DO NOTHING;
|
||||
EOF
|
||||
}
|
||||
|
||||
record_node_history() {
|
||||
local process_name=$1
|
||||
local old_version=$2
|
||||
local new_version=$3
|
||||
local old_path=$4
|
||||
local new_path=$5
|
||||
|
||||
# node_version_history table does not exist - skip recording
|
||||
true
|
||||
}
|
||||
|
||||
record_monitor_service() {
|
||||
local service=$1
|
||||
local status=$2
|
||||
local version=$3
|
||||
local error_msg=$4
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_services (service_name, service_type, status, response_time_ms, error_message, checked_at)
|
||||
VALUES ('$service', 'node', '$status', 0, '$version - $error_msg', NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Discover Node.js processes
|
||||
#===============================================================================
|
||||
discover_node_processes() {
|
||||
log "發現 Node.js 進程..."
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
echo "Node.js 監控狀態"
|
||||
echo "時間: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
# Get all node processes
|
||||
local node_pids=$(pgrep -f "node" 2>/dev/null)
|
||||
|
||||
if [ -z "$node_pids" ]; then
|
||||
echo -e "${RED}沒有運行中的 Node.js 進程${NC}"
|
||||
record_monitor_service "node" "down" "N/A" "No processes"
|
||||
return 1
|
||||
fi
|
||||
|
||||
echo "鎖定版本: Node.js ${LOCKED_NODE_VERSION}.x (n8n 專用)"
|
||||
echo ""
|
||||
echo "----------------------------------------"
|
||||
echo "發現的 Node.js 進程:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
local total_processes=0
|
||||
local n8n_processes=0
|
||||
local version_issues=0
|
||||
|
||||
for pid in $node_pids; do
|
||||
# Get the process command line
|
||||
local cmd=$(ps -o args= -p $pid 2>/dev/null | head -1)
|
||||
|
||||
# Get the Node.js binary path and version
|
||||
local node_path=$(lsof -p $pid 2>/dev/null | grep "txt" | grep "node" | head -1 | awk '{print $NF}' | grep -v "dylib")
|
||||
|
||||
if [ -n "$node_path" ] && [ -f "$node_path" ]; then
|
||||
local node_version=$("$node_path" --version 2>/dev/null | sed 's/v//')
|
||||
local node_major=$(echo "$node_version" | cut -d. -f1)
|
||||
local node_minor=$(echo "$node_version" | cut -d. -f2)
|
||||
else
|
||||
local node_version="unknown"
|
||||
local node_major="unknown"
|
||||
fi
|
||||
|
||||
# Memory usage (MB)
|
||||
local mem=$(ps -o rss= -p $pid 2>/dev/null | awk '{print int($1/1024)}')
|
||||
|
||||
# CPU usage (%)
|
||||
local cpu=$(ps -o %cpu= -p $pid 2>/dev/null | awk '{print int($1)}')
|
||||
|
||||
# Elapsed running time
|
||||
local time=$(ps -o etime= -p $pid 2>/dev/null | tr -d ' ')
|
||||
|
||||
# Identify the service type
|
||||
local service_type="other"
|
||||
if echo "$cmd" | grep -q "n8n"; then
|
||||
service_type="n8n"
|
||||
n8n_processes=$((n8n_processes + 1))
|
||||
elif echo "$cmd" | grep -q "worker"; then
|
||||
service_type="n8n-worker"
|
||||
n8n_processes=$((n8n_processes + 1))
|
||||
fi
|
||||
|
||||
# Version check
|
||||
local version_status="✅"
|
||||
if [ "$service_type" = "n8n" ] || [ "$service_type" = "n8n-worker" ]; then
|
||||
if [ "$node_major" != "$LOCKED_NODE_VERSION" ]; then
|
||||
version_status="❌ 版本錯誤!"
|
||||
version_issues=$((version_issues + 1))
|
||||
log_error "n8n 使用 Node.js $node_version (應為 ${LOCKED_NODE_VERSION}.x)"
|
||||
fi
|
||||
fi
|
||||
|
||||
echo " PID: $pid"
|
||||
echo " 命令: ${cmd:0:60}..."
|
||||
echo " Node.js: $node_version $version_status"
|
||||
echo " 路徑: $node_path"
|
||||
echo " 內存: ${mem}MB | CPU: ${cpu}% | 運行: $time"
|
||||
echo " 類型: $service_type"
|
||||
echo ""
|
||||
|
||||
total_processes=$((total_processes + 1))
|
||||
|
||||
# Record the baseline
|
||||
record_node_baseline "$service_type" "$node_version" "$node_path" "$pid"
|
||||
done
|
||||
|
||||
echo "----------------------------------------"
|
||||
echo "總結:"
|
||||
echo " 總進程數: $total_processes"
|
||||
echo " n8n 相關: $n8n_processes"
|
||||
echo " 版本問題: $version_issues"
|
||||
echo "========================================"
|
||||
|
||||
# Record to the database
|
||||
if [ $version_issues -gt 0 ]; then
|
||||
record_monitor_service "node" "degraded" "${LOCKED_NODE_VERSION}.x" "$version_issues version issues"
|
||||
return 1
|
||||
else
|
||||
record_monitor_service "node" "up" "${LOCKED_NODE_VERSION}.x" "OK"
|
||||
return 0
|
||||
fi
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Version baseline check
|
||||
#===============================================================================
|
||||
check_baseline() {
|
||||
log "檢查 Node.js 版本基線..."
|
||||
|
||||
# Check the n8n process
|
||||
local n8n_pid=$(pgrep -f "n8n start" | head -1)
|
||||
|
||||
if [ -z "$n8n_pid" ]; then
|
||||
log_error "n8n 進程未運行"
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Get the Node.js version that n8n is running on
|
||||
local node_path=$(lsof -p $n8n_pid 2>/dev/null | grep "txt" | grep "node" | head -1 | awk '{print $NF}' | grep -v "dylib")
|
||||
|
||||
if [ -n "$node_path" ] && [ -f "$node_path" ]; then
|
||||
local node_version=$("$node_path" --version 2>/dev/null | sed 's/v//')
|
||||
local node_major=$(echo "$node_version" | cut -d. -f1)
|
||||
|
||||
echo "n8n 當前 Node.js 版本: $node_version"
|
||||
|
||||
if [ "$node_major" = "$LOCKED_NODE_VERSION" ]; then
|
||||
log_success "版本正確: Node.js $node_version"
|
||||
return 0
|
||||
else
|
||||
log_error "版本錯誤: Node.js $node_version (應為 ${LOCKED_NODE_VERSION}.x)"
|
||||
return 1
|
||||
fi
|
||||
else
|
||||
log_error "無法確定 Node.js 版本"
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Show status
|
||||
#===============================================================================
|
||||
show_status() {
|
||||
discover_node_processes
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Main program
|
||||
#===============================================================================
|
||||
command=${1:-status}
|
||||
|
||||
case $command in
|
||||
status|check)
|
||||
show_status
|
||||
;;
|
||||
baseline)
|
||||
check_baseline
|
||||
;;
|
||||
*)
|
||||
echo "用法: $0 {status|check|baseline}"
|
||||
echo ""
|
||||
echo " status - 顯示 Node.js 監控狀態"
|
||||
echo " baseline - 檢查版本基線"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
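The version checks in `node_monitor.sh` extract the major version with a `sed`/`cut` pipeline. As a sketch, the same parsing can be done with bash parameter expansion alone. `node_major` and `is_locked_major` are hypothetical names, not functions from this commit:

```bash
#!/bin/bash
# Hypothetical helpers (not in the original script).

# Print the major version of a string such as "v22.11.0" or "22.11.0".
node_major() {
    local version=${1#v}     # drop a leading "v" if present
    echo "${version%%.*}"    # keep everything before the first dot
}

# Succeed when the given version has the locked major version.
is_locked_major() {
    [ "$(node_major "$1")" = "$2" ]
}

node_major "v22.11.0"                                      # prints 22
is_locked_major "v20.9.0" "22" || echo "version mismatch"  # prints version mismatch
```

Parameter expansion avoids spawning two extra processes per check, which matters little here but keeps the comparison in one place.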
281
monitor/service/python_monitor.sh
Executable file
@@ -0,0 +1,281 @@
|
||||
#!/bin/bash
|
||||
|
||||
#===============================================================================
|
||||
# Momentry Python monitoring script
# Path: /Users/accusys/momentry_core_0.1/monitor/service/python_monitor.sh
#
# What it monitors:
#   - Python version lock for Momentry scripts (3.11.14)
#   - Process count and state
#   - Script execution state
#
# Usage:
#   ./python_monitor.sh status     # show monitoring status
#   ./python_monitor.sh baseline   # check that each script's shebang uses python3.11
#   ./python_monitor.sh check      # same as status
|
||||
#===============================================================================
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
MONITOR_DIR="$(dirname "$SCRIPT_DIR")"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/python_check.log"
|
||||
|
||||
# Color definitions
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
BLUE='\033[0;34m'
|
||||
NC='\033[0m'
|
||||
|
||||
# Locked Python version
|
||||
LOCKED_PYTHON_VERSION="3.11.14"
|
||||
LOCKED_PYTHON_MAJOR="3"
|
||||
LOCKED_PYTHON_MINOR="11"
|
||||
|
||||
# Momentry Python scripts
|
||||
MOMENTRY_SCRIPTS=(
|
||||
"/Users/accusys/momentry_core_0.1/scripts/asr_processor.py"
|
||||
"/Users/accusys/momentry_core_0.1/scripts/thumbnail_extractor.py"
|
||||
)
|
||||
|
||||
#===============================================================================
|
||||
# Logging functions
|
||||
#===============================================================================
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
log_success() {
|
||||
echo -e "${GREEN}[$(date '+%Y-%m-%d %H:%M:%S')] ✅ $1${NC}" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
log_error() {
|
||||
echo -e "${RED}[$(date '+%Y-%m-%d %H:%M:%S')] ❌ $1${NC}" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
log_warn() {
|
||||
echo -e "${YELLOW}[$(date '+%Y-%m-%d %H:%M:%S')] ⚠️ $1${NC}" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Record to the database
|
||||
#===============================================================================
|
||||
record_python_baseline() {
|
||||
local runtime_name=$1
|
||||
local current_version=$2
|
||||
local interpreter_path=$3
|
||||
|
||||
local required_version="${LOCKED_PYTHON_VERSION}"
|
||||
local is_compliant="false"
|
||||
if [[ "$current_version" == "${LOCKED_PYTHON_VERSION}" ]]; then
|
||||
is_compliant="true"
|
||||
fi
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO python_version_baseline (runtime_name, required_version, current_version, interpreter_path, is_compliant, checked_at)
|
||||
VALUES ('$runtime_name', '$required_version', '$current_version', '$interpreter_path', $is_compliant, NOW())
|
||||
ON CONFLICT DO NOTHING;
|
||||
EOF
|
||||
}
|
||||
|
||||
record_python_history() {
|
||||
local script_name=$1
|
||||
local old_version=$2
|
||||
local new_version=$3
|
||||
local old_path=$4
|
||||
local new_path=$5
|
||||
|
||||
# python_version_history table does not exist - skip recording
|
||||
true
|
||||
}
|
||||
|
||||
record_monitor_service() {
|
||||
local service=$1
|
||||
local status=$2
|
||||
local version=$3
|
||||
local error_msg=$4
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_services (service_name, service_type, status, response_time_ms, error_message, checked_at)
|
||||
VALUES ('$service', 'python', '$status', 0, '$version - $error_msg', NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Discover Python processes
|
||||
#===============================================================================
|
||||
discover_python_processes() {
|
||||
log "發現 Python 進程..."
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
echo "Python 監控狀態"
|
||||
echo "時間: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
echo "鎖定版本: Python ${LOCKED_PYTHON_VERSION} (Momentry 專用)"
|
||||
echo ""
|
||||
|
||||
# Check the Momentry scripts
|
||||
echo "----------------------------------------"
|
||||
echo "Momentry Python 腳本:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
local script_issues=0
|
||||
|
||||
for script in "${MOMENTRY_SCRIPTS[@]}"; do
|
||||
if [ -f "$script" ]; then
|
||||
# Find the Python interpreter the script uses
|
||||
local shebang=$(head -1 "$script")
|
||||
local python_path=""
|
||||
|
||||
if [[ "$shebang" == *"/python3.11"* ]]; then
|
||||
python_path="/opt/homebrew/bin/python3.11"
|
||||
elif [[ "$shebang" == *"/python3"* ]]; then
|
||||
# Fall back to the system python3
|
||||
python_path=$(which python3 2>/dev/null)
|
||||
fi
|
||||
|
||||
if [ -n "$python_path" ] && [ -f "$python_path" ]; then
|
||||
local python_version=$("$python_path" --version 2>&1 | sed 's/Python //')
|
||||
local python_major=$(echo "$python_version" | cut -d. -f1)
|
||||
local python_minor=$(echo "$python_version" | cut -d. -f2)
|
||||
|
||||
# Check the version
|
||||
local version_status="✅"
|
||||
if [ "$python_major" = "$LOCKED_PYTHON_MAJOR" ] && [ "$python_minor" = "$LOCKED_PYTHON_MINOR" ]; then
|
||||
log_success "$(basename $script): $python_version"
|
||||
else
|
||||
version_status="❌ 版本錯誤!"
|
||||
script_issues=$((script_issues + 1))
|
||||
log_error "$(basename $script): $python_version (應為 ${LOCKED_PYTHON_VERSION})"
|
||||
fi
|
||||
|
||||
echo " $(basename $script)"
|
||||
echo " 路徑: $python_path"
|
||||
echo " 版本: $python_version $version_status"
|
||||
echo " shebang: $shebang"
|
||||
echo ""
|
||||
|
||||
# Record the baseline
|
||||
record_python_baseline "python_${LOCKED_PYTHON_VERSION}" "$python_version" "$python_path"
|
||||
else
|
||||
log_error "$(basename $script): 無法確定 Python 路徑"
|
||||
script_issues=$((script_issues + 1))
|
||||
fi
|
||||
else
|
||||
log_warn "$(basename $script): 文件不存在"
|
||||
script_issues=$((script_issues + 1))
|
||||
fi
|
||||
done
|
||||
|
||||
# Check running Python processes
|
||||
echo "----------------------------------------"
|
||||
echo "運行中的 Python 進程:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
local python_pids=$(pgrep -f "python" 2>/dev/null)
|
||||
local total_processes=0
|
||||
|
||||
if [ -n "$python_pids" ]; then
|
||||
for pid in $python_pids; do
|
||||
# Get the process command line
|
||||
local cmd=$(ps -o args= -p $pid 2>/dev/null | head -1 | cut -c1-80)
|
||||
|
||||
# Get the Python binary path
|
||||
local python_path=$(lsof -p $pid 2>/dev/null | grep "txt" | grep "Python" | head -1 | awk '{print $NF}' | grep -v "dylib")
|
||||
|
||||
if [ -n "$python_path" ] && [ -f "$python_path" ]; then
|
||||
local python_version=$("$python_path" --version 2>&1 | sed 's/Python //')
|
||||
else
|
||||
local python_version="unknown"
|
||||
fi
|
||||
|
||||
# Memory usage (MB)
|
||||
local mem=$(ps -o rss= -p $pid 2>/dev/null | awk '{print int($1/1024)}')
|
||||
|
||||
echo " PID $pid: $cmd"
|
||||
echo " Python: $python_version"
|
||||
echo " 內存: ${mem}MB"
|
||||
echo ""
|
||||
|
||||
total_processes=$((total_processes + 1))
|
||||
done
|
||||
else
|
||||
echo " (無運行中的 Python 進程)"
|
||||
fi
|
||||
|
||||
echo "----------------------------------------"
|
||||
echo "總結:"
|
||||
echo " 總進程數: $total_processes"
|
||||
echo " 腳本問題: $script_issues"
|
||||
echo "========================================"
|
||||
|
||||
# Record to the database
|
||||
if [ $script_issues -gt 0 ]; then
|
||||
record_monitor_service "python" "degraded" "${LOCKED_PYTHON_VERSION}" "$script_issues issues"
|
||||
return 1
|
||||
else
|
||||
record_monitor_service "python" "up" "${LOCKED_PYTHON_VERSION}" "OK"
|
||||
return 0
|
||||
fi
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Version baseline check
|
||||
#===============================================================================
|
||||
check_baseline() {
|
||||
log "檢查 Python 版本基線..."
|
||||
|
||||
local script_issues=0
|
||||
|
||||
for script in "${MOMENTRY_SCRIPTS[@]}"; do
|
||||
if [ -f "$script" ]; then
|
||||
local shebang=$(head -1 "$script")
|
||||
|
||||
if [[ "$shebang" == *"/python3.11"* ]]; then
|
||||
log_success "$(basename $script): 使用正確版本"
|
||||
else
|
||||
log_error "$(basename $script): 未使用 python3.11"
|
||||
script_issues=$((script_issues + 1))
|
||||
fi
|
||||
fi
|
||||
done
|
||||
|
||||
if [ $script_issues -gt 0 ]; then
|
||||
return 1
|
||||
else
|
||||
return 0
|
||||
fi
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Show status
|
||||
#===============================================================================
|
||||
show_status() {
|
||||
discover_python_processes
|
||||
}
|
||||
|
||||
#===============================================================================
|
||||
# Main program
|
||||
#===============================================================================
|
||||
command=${1:-status}
|
||||
|
||||
case $command in
|
||||
status|check)
|
||||
show_status
|
||||
;;
|
||||
baseline)
|
||||
check_baseline
|
||||
;;
|
||||
*)
|
||||
echo "用法: $0 {status|check|baseline}"
|
||||
echo ""
|
||||
echo " status - 顯示 Python 監控狀態"
|
||||
echo " baseline - 檢查版本基線"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
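`python_monitor.sh` decides which interpreter a script uses by pattern-matching its shebang for `/python3.11` or `/python3`, so a line such as `#!/usr/bin/env python3.11` matches neither pattern. A minimal sketch of a more general parser, assuming a hypothetical helper named `shebang_interpreter` that is not part of this commit:

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): print the interpreter
# named by a shebang line, handling both "#!/path/to/python3.11" and
# "#!/usr/bin/env python3.11". Returns 1 if the line is not a shebang.
shebang_interpreter() {
    local line=$1
    local first second
    case $line in
        '#!'*) line=${line#'#!'} ;;
        *) return 1 ;;
    esac
    # Split the rest of the line into whitespace-separated words.
    read -r first second _ <<< "$line"
    if [ "${first##*/}" = "env" ] && [ -n "$second" ]; then
        echo "$second"    # "env NAME": the interpreter is looked up on PATH
    else
        echo "$first"     # absolute interpreter path
    fi
}

shebang_interpreter '#!/usr/bin/env python3.11'   # prints python3.11
```

A caller could pass the result to `command -v` to resolve a bare name to a path before running `--version` on it.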
375
monitor/storage/backup_monitor.sh
Executable file
@@ -0,0 +1,375 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry backup monitoring and hot/warm/cold tiering
# Path: /Users/accusys/momentry_core_0.1/monitor/storage/backup_monitor.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/backup_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
# Backup root directory
|
||||
BACKUP_BASE="/Users/accusys/momentry/backup"
|
||||
|
||||
# Service list
|
||||
SERVICES=("postgresql" "redis" "mariadb" "n8n" "qdrant" "gitea" "ollama" "caddy" "mongodb" "sftpgo" "php")
|
||||
|
||||
# Hot/warm/cold tier thresholds (age in days)
TIER_HOT=7        # up to 7 days: fast storage
TIER_WARM=30      # 7-30 days: standard storage
TIER_COLD=90      # 30-90 days: low-cost storage
TIER_ARCHIVE=365  # over 90 days: archive
|
||||
|
||||
# Record backup metadata
|
||||
record_backup() {
|
||||
local service=$1
|
||||
local backup_file=$2
|
||||
local backup_size=$3
|
||||
local backup_type=$4
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO backup_registry (service_name, backup_file, backup_size_bytes, backup_type, status, created_at)
|
||||
VALUES ('$service', '$backup_file', $backup_size, '$backup_type', 'completed', NOW())
|
||||
ON CONFLICT DO NOTHING;
|
||||
EOF
|
||||
}
|
||||
|
||||
# Record backup storage statistics
|
||||
record_backup_stats() {
|
||||
local tier=$1
|
||||
local file_count=$2
|
||||
local total_size=$3
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO backup_storage_stats (tier, file_count, total_size_bytes, record_time)
|
||||
VALUES ('$tier', $file_count, $total_size, NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
# Initialize the backup directory structure
|
||||
init_backup_dirs() {
|
||||
log "初始化備份目錄結構..."
|
||||
|
||||
mkdir -p "$BACKUP_BASE"/{daily,weekly,monthly,archive}
|
||||
|
||||
for service in "${SERVICES[@]}"; do
|
||||
mkdir -p "$BACKUP_BASE/daily/$service"
|
||||
mkdir -p "$BACKUP_BASE/weekly/$service"
|
||||
mkdir -p "$BACKUP_BASE/monthly/$service"
|
||||
done
|
||||
|
||||
log "備份目錄結構已初始化"
|
||||
}
|
||||
|
||||
# Check backup status
|
||||
check_backup_status() {
|
||||
log "=== 檢查備份狀態 ==="
|
||||
|
||||
# Naming convention: {service}_{type}_{YYYYMMDD}_{HHMMSS}.{ext}
# Example: postgresql_db_20260315_030000.sql.gz
|
||||
|
||||
local total_backup_size=0
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
echo "備份監控狀態"
|
||||
echo "時間: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
echo "命名規範: {service}_{type}_{YYYYMMDD}_{HHMMSS}.{ext}"
|
||||
echo ""
|
||||
|
||||
for service in "${SERVICES[@]}"; do
|
||||
service_backup_dir="$BACKUP_BASE/daily/$service"
|
||||
|
||||
if [ -d "$service_backup_dir" ]; then
|
||||
file_count=$(find "$service_backup_dir" -type f 2>/dev/null | wc -l)
|
||||
size=$(du -sb "$service_backup_dir" 2>/dev/null | cut -f1)
|
||||
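# Pick the most recently modified backup (find alone returns files in arbitrary order)
latest_file=$(find "$service_backup_dir" -type f \( -name "*.tar.gz" -o -name "*.sql.gz" -o -name "*.rdb" \) -exec stat -f "%m %N" {} + 2>/dev/null | sort -n | tail -1 | cut -d' ' -f2-)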
|
||||
|
||||
# Handle an empty or zero size (for example when du does not support -b)
|
||||
if [ -z "$size" ] || [ "$size" = "0" ]; then
|
||||
size=$(find "$service_backup_dir" -type f -exec ls -l {} \; 2>/dev/null | awk '{sum+=$5} END {print sum}')
|
||||
fi
|
||||
|
||||
size_str="0B"
|
||||
if [ -n "$size" ] && [ "$size" -gt 0 ]; then
|
||||
if [ "$size" -gt 1073741824 ]; then
|
||||
size_str="$((size / 1073741824))GB"
|
||||
elif [ "$size" -gt 1048576 ]; then
|
||||
size_str="$((size / 1048576))MB"
|
||||
elif [ "$size" -gt 1024 ]; then
|
||||
size_str="$((size / 1024))KB"
|
||||
else
|
||||
size_str="${size}B"
|
||||
fi
|
||||
fi
|
||||
|
||||
# Work out how many days ago the latest backup ran (from the date in its file name)
days_since_backup=-1

if [ -n "$latest_file" ]; then
    # Extract the YYYYMMDD date from the file name
    file_date=$(basename "$latest_file" | grep -oE '[0-9]{8}' | head -1)
    file_epoch=$(date -j -f "%Y%m%d" "$file_date" +%s 2>/dev/null)
    if [ -n "$file_epoch" ]; then
        days_since_backup=$(( ($(date +%s) - file_epoch) / 86400 ))
    fi
fi

# Status indicator
if [ "$days_since_backup" -lt 0 ]; then
    status="❌ 找不到備份"
elif [ "$days_since_backup" -eq 0 ]; then
    status="✅ 今日已備份"
elif [ "$days_since_backup" -le 1 ]; then
    status="⚠️ 昨日已備份"
elif [ "$days_since_backup" -le 7 ]; then
    status="⚠️ ${days_since_backup}天前"
else
    status="❌ 超過${days_since_backup}天未備份!"
fi
|
||||
|
||||
echo " $service: $file_count 個文件, $size_str | $status"
|
||||
|
||||
[ -n "$size" ] && total_backup_size=$((total_backup_size + size))
|
||||
|
||||
# Record to the database
|
||||
[ -n "$size" ] && record_backup "$service" "$service_backup_dir" "$size" "daily"
|
||||
else
|
||||
echo " $service: ❌ 備份目錄不存在"
|
||||
fi
|
||||
done
|
||||
|
||||
echo ""
|
||||
echo "----------------------------------------"
|
||||
echo "存儲分層:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
for tier in daily weekly monthly archive; do
|
||||
tier_path="$BACKUP_BASE/$tier"
|
||||
if [ -d "$tier_path" ]; then
|
||||
file_count=$(find "$tier_path" -type f 2>/dev/null | wc -l)
|
||||
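# du -b is a GNU option; sum the file sizes with BSD stat instead
size=$(find "$tier_path" -type f -exec stat -f "%z" {} + 2>/dev/null | awk '{sum+=$1} END {print sum+0}')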
|
||||
|
||||
tier_size_str="0B"
|
||||
if [ -n "$size" ] && [ "$size" -gt 0 ] 2>/dev/null; then
|
||||
if [ "$size" -gt 1073741824 ]; then
|
||||
tier_size_str="$((size / 1073741824))GB"
|
||||
elif [ "$size" -gt 1048576 ]; then
|
||||
tier_size_str="$((size / 1048576))MB"
|
||||
else
|
||||
tier_size_str="$((size / 1024))KB"
|
||||
fi
|
||||
fi
|
||||
|
||||
echo " $tier: $file_count 個文件, $tier_size_str"
|
||||
[ -n "$size" ] && record_backup_stats "$tier" "$file_count" "$size"
|
||||
fi
|
||||
done
|
||||
|
||||
total_size_str="0B"
|
||||
if [ "$total_backup_size" -gt 1073741824 ]; then
|
||||
total_size_str="$((total_backup_size / 1073741824))GB"
|
||||
elif [ "$total_backup_size" -gt 1048576 ]; then
|
||||
total_size_str="$((total_backup_size / 1048576))MB"
|
||||
elif [ "$total_backup_size" -gt 1024 ]; then
|
||||
total_size_str="$((total_backup_size / 1024))KB"
|
||||
else
|
||||
total_size_str="${total_backup_size}B"
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo "----------------------------------------"
|
||||
echo "總計: ${total_backup_size} bytes ($total_size_str)"
|
||||
echo "========================================"
|
||||
|
||||
# Record the total
|
||||
record_backup_stats "total" 0 "$total_backup_size"
|
||||
}
|
||||
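`check_backup_status` repeats the same bytes-to-GB/MB/KB conversion for each service, each tier, and the grand total. A sketch of a shared helper with the same thresholds and integer division, assuming a hypothetical function named `format_bytes` that is not part of this commit:

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): render a byte count
# using the same thresholds and integer division as the status report.
format_bytes() {
    local size=${1:-0}
    if [ "$size" -gt 1073741824 ]; then
        echo "$((size / 1073741824))GB"
    elif [ "$size" -gt 1048576 ]; then
        echo "$((size / 1048576))MB"
    elif [ "$size" -gt 1024 ]; then
        echo "$((size / 1024))KB"
    else
        echo "${size}B"
    fi
}

format_bytes 5368709120   # prints 5GB
format_bytes 1536         # prints 1KB
format_bytes 512          # prints 512B
```

Because the division is integer division, values are rounded down; 1536 bytes is reported as 1KB rather than 1.5KB, matching the existing output.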
|
||||
# Hot/warm/cold tiering: move older backups into lower-cost storage
tier_backups() {
    log "執行溫冷轉移..."

    local moved_count=0
    local file service filename timestamp year week month dest_dir

    # Older than 7 days: daily -> weekly
    # Naming format: {service}_{type}_{YYYYMMDD}_{HHMMSS}.{ext}
    # The loops read from process substitution rather than a pipe, so
    # moved_count is updated in this shell and not lost in a subshell.
    while IFS= read -r file; do
        service=$(basename "$(dirname "$file")")

        # Parse the timestamp (grep -E, because BSD grep on macOS has no -P)
        filename=$(basename "$file")
        timestamp=$(echo "$filename" | grep -oE '[0-9]{8}_[0-9]{6}' | head -1)

        if [ -n "$timestamp" ]; then
            year=${timestamp:0:4}
            # %G is the ISO week-numbering year that pairs with %V
            week=$(date -j -f "%Y%m%d_%H%M%S" "$timestamp" +%G-W%V 2>/dev/null || echo "$year-W$(date +%V)")
        else
            week=$(date +%G-W%V)
        fi

        dest_dir="$BACKUP_BASE/weekly/$service/$week"
        mkdir -p "$dest_dir"

        mv "$file" "$dest_dir/" 2>/dev/null && log "移動: $file -> $dest_dir" && moved_count=$((moved_count + 1))
    done < <(find "$BACKUP_BASE/daily" -type f -mtime +7)

    # Older than 30 days: weekly -> monthly
    while IFS= read -r file; do
        service=$(basename "$(dirname "$(dirname "$file")")")
        month=$(date +%Y-%m)

        dest_dir="$BACKUP_BASE/monthly/$service/$month"
        mkdir -p "$dest_dir"

        mv "$file" "$dest_dir/" 2>/dev/null && log "移動: $file -> $dest_dir" && moved_count=$((moved_count + 1))
    done < <(find "$BACKUP_BASE/weekly" -type f -mtime +30)

    # Older than 90 days: monthly -> archive (long-term archive)
    while IFS= read -r file; do
        service=$(basename "$(dirname "$(dirname "$file")")")
        year=$(date +%Y)

        dest_dir="$BACKUP_BASE/archive/$service/$year"
        mkdir -p "$dest_dir"

        mv "$file" "$dest_dir/" 2>/dev/null && log "歸檔: $file -> $dest_dir" && moved_count=$((moved_count + 1))
    done < <(find "$BACKUP_BASE/monthly" -type f -mtime +90)

    log "溫冷轉移完成: 移動了 $moved_count 個文件"
}
|
||||
|
||||
# Clean up expired backups
|
||||
cleanup_old() {
|
||||
log "清理過期備份..."
|
||||
|
||||
# Archived backups older than 365 days
|
||||
find "$BACKUP_BASE/archive" -type f -mtime +365 -delete 2>/dev/null
|
||||
|
||||
# Keep monthly backups for 12 months
|
||||
find "$BACKUP_BASE/monthly" -type f -mtime +365 -delete 2>/dev/null
|
||||
|
||||
# Keep weekly backups for 12 weeks
|
||||
find "$BACKUP_BASE/weekly" -type f -mtime +84 -delete 2>/dev/null
|
||||
|
||||
# Keep daily backups for 30 days
|
||||
find "$BACKUP_BASE/daily" -type f -mtime +30 -delete 2>/dev/null
|
||||
|
||||
log "清理完成"
|
||||
}
|
||||
|
||||
# 驗證備份完整性
|
||||
verify_backup() {
|
||||
local backup_file=$1
|
||||
|
||||
if [[ "$backup_file" == *.tar.gz ]]; then
|
||||
tar -tzf "$backup_file" > /dev/null 2>&1
|
||||
return $?
|
||||
elif [[ "$backup_file" == *.sql ]]; then
    # pg_dump output starts with a bare "--" line, so look past line 1 for the dump banner
    head -5 "$backup_file" | grep -q "SQL" && return 0
    return 1
elif [[ "$backup_file" == *.rdb ]]; then
    # RDB snapshots begin with the magic string "REDIS"
    head -c 5 "$backup_file" | grep -q "REDIS" && return 0
    return 1
|
||||
fi
|
||||
|
||||
return 0
|
||||
}
|
||||
|
||||
# Generate the backup report
|
||||
generate_report() {
|
||||
local report_file="/Users/accusys/momentry/log/backup_report_$(date +%Y%m%d).txt"
|
||||
|
||||
{
|
||||
echo "========================================"
|
||||
echo "Momentry 備份報告"
|
||||
echo "生成時間: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
echo "## 備份狀態"
|
||||
check_backup_status
|
||||
|
||||
echo ""
|
||||
echo "## 存儲使用趨勢 (最近30天)"
|
||||
psql -U accusys -h localhost -d momentry -t -A -c "
|
||||
SELECT tier,
|
||||
COUNT(*) as files,
|
||||
AVG(total_size_bytes)::bigint as avg_size,
|
||||
MAX(total_size_bytes)::bigint as max_size
|
||||
FROM backup_storage_stats
|
||||
WHERE record_time > NOW() - INTERVAL '30 days'
|
||||
GROUP BY tier
|
||||
ORDER BY tier;
|
||||
" 2>/dev/null || echo " (無數據)"
|
||||
|
||||
echo ""
|
||||
echo "## 建議"
|
||||
|
||||
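# Flag services whose newest daily backup is more than 7 days old
for service in "${SERVICES[@]}"; do
    # Pick the newest file by mtime (BSD stat: '%m %N' prints "mtime name"); plain head -1 is arbitrary
    latest=$(find "$BACKUP_BASE/daily/$service" -type f -exec stat -f '%m %N' {} + 2>/dev/null | sort -rn | head -1 | cut -d' ' -f2-)
    if [ -n "$latest" ]; then
        # The subtraction needs its own parentheses inside $(( ... ))
        days_old=$(( ($(date +%s) - $(stat -f "%m" "$latest" 2>/dev/null || echo "0")) / 86400 ))
        if [ "$days_old" -gt 7 ]; then
            echo " - ⚠️ $service 超過 $days_old 天未備份,建議立即執行備份"
        fi
    fi
done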
|
||||
|
||||
} > "$report_file"
|
||||
|
||||
log "報告已生成: $report_file"
|
||||
echo "$report_file"
|
||||
}
|
||||
|
||||
# Main program
|
||||
command=${1:-status}
|
||||
|
||||
case $command in
|
||||
status)
|
||||
check_backup_status
|
||||
;;
|
||||
init)
|
||||
init_backup_dirs
|
||||
;;
|
||||
tier)
|
||||
tier_backups
|
||||
;;
|
||||
cleanup)
|
||||
cleanup_old
|
||||
;;
|
||||
verify)
|
||||
verify_backup "${2:-}"
|
||||
;;
|
||||
report)
|
||||
generate_report
|
||||
;;
|
||||
all)
|
||||
log "執行完整備份維護..."
|
||||
check_backup_status
|
||||
tier_backups
|
||||
cleanup_old
|
||||
generate_report
|
||||
log "備份維護完成"
|
||||
;;
|
||||
*)
|
||||
echo "用法: $0 {status|init|tier|cleanup|verify|report|all}"
|
||||
echo ""
|
||||
echo " status - 檢查備份狀態"
|
||||
echo " init - 初始化備份目錄"
|
||||
echo " tier - 執行溫冷轉移"
|
||||
echo " cleanup - 清理過期備份"
|
||||
echo " verify - 驗證備份完整性"
|
||||
echo " report - 生成備份報告"
|
||||
echo " all - 執行所有維護任務"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
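The tiering rules and the report's staleness check both depend on a file's age in days. The following is a minimal sketch of a portable helper for that calculation. `file_age_days` is a hypothetical name that does not appear in the script, and the sketch assumes that either GNU `stat -c %Y` or BSD/macOS `stat -f %m` is available.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script):
# print the age of a file in whole days, based on its modification time.
file_age_days() {
    local file=$1 mtime now
    # GNU stat is tried first; BSD/macOS stat rejects -c, so the fallback runs there.
    mtime=$(stat -c %Y "$file" 2>/dev/null || stat -f %m "$file" 2>/dev/null) || return 1
    now=$(date +%s)
    # Parenthesize the subtraction before dividing by seconds per day.
    echo $(( (now - mtime) / 86400 ))
}
```

A caller could then write `[ "$(file_age_days "$latest")" -gt 7 ]` instead of repeating the `stat` arithmetic inline.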
193
monitor/storage/storage_manager.sh
Executable file
@@ -0,0 +1,193 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry storage management (Layer 7)
# Path: /Users/accusys/momentry_core_0.1/monitor/storage/storage_manager.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/storage_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
# Storage path configuration
|
||||
STORAGE_BASE="/Users/accusys/momentry"
|
||||
DATA_DIR="$STORAGE_BASE/data"
|
||||
TEMP_DIR="$STORAGE_BASE/tmp"
|
||||
BACKUP_DIR="$STORAGE_BASE/backup"
|
||||
|
||||
# User clusters
|
||||
CLUSTERS=("family" "work" "wordpress" "shared")
|
||||
|
||||
# Record usage statistics
|
||||
record_usage() {
|
||||
local cluster=$1
|
||||
local tier=$2
|
||||
local file_count=$3
|
||||
local total_size=$4
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO storage_usage_stats (user_cluster, storage_tier, file_count, total_size_bytes, record_time)
|
||||
VALUES ('$cluster', '$tier', $file_count, $total_size, NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
# Register a file in the file registry
|
||||
register_file() {
|
||||
local file_path=$1
|
||||
local user_cluster=$2
|
||||
local file_size=$3
|
||||
|
||||
file_name=$(basename "$file_path")
|
||||
# Fall back to BSD md5 when md5sum is not installed (for example on macOS)
file_hash=$(echo "$file_path" | { md5sum 2>/dev/null || md5; } | cut -d' ' -f1)
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO file_registry (file_name, file_path, file_path_hash, file_size, user_cluster, storage_tier, status, created_at)
|
||||
VALUES ('$file_name', '$file_path', '$file_hash', $file_size, '$user_cluster', 'hot', 'active', NOW())
|
||||
ON CONFLICT (file_path_hash) DO UPDATE SET
|
||||
last_accessed_at = NOW(),
|
||||
access_count = file_registry.access_count + 1;
|
||||
EOF
|
||||
}
|
||||
|
||||
# Initialize the directory structure
|
||||
init_directories() {
|
||||
echo "初始化目錄結構..."
|
||||
|
||||
# Top-level directories
|
||||
mkdir -p "$DATA_DIR"
|
||||
mkdir -p "$TEMP_DIR"
|
||||
mkdir -p "$BACKUP_DIR"/{daily,weekly,monthly,archive}
|
||||
|
||||
# Per-cluster directories
|
||||
for cluster in "${CLUSTERS[@]}"; do
|
||||
mkdir -p "$DATA_DIR/$cluster"
|
||||
done
|
||||
|
||||
# Temporary subdirectories
|
||||
mkdir -p "$TEMP_DIR"/{upload,processing,cache,session}
|
||||
|
||||
echo "目錄結構已初始化"
|
||||
}
|
||||
|
||||
# Show storage status
|
||||
show_status() {
|
||||
echo "========================================"
|
||||
echo "Layer 7: Storage Status"
|
||||
echo "Time: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
echo "存儲路徑:"
|
||||
echo " 數據: $DATA_DIR"
|
||||
echo " 臨時: $TEMP_DIR"
|
||||
echo " 備份: $BACKUP_DIR"
|
||||
echo ""
|
||||
|
||||
echo "用戶集群:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
for cluster in "${CLUSTERS[@]}"; do
|
||||
cluster_path="$DATA_DIR/$cluster"
|
||||
if [ -d "$cluster_path" ]; then
|
||||
file_count=$(find "$cluster_path" -type f 2>/dev/null | wc -l)
|
||||
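# du -sb is GNU-only; -k works with both BSD/macOS and GNU du, so convert KiB to bytes
size_kb=$(du -sk "$cluster_path" 2>/dev/null | cut -f1)
total_size=$(( ${size_kb:-0} * 1024 ))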
|
||||
|
||||
size_str="0B"
|
||||
if [ -n "$total_size" ] && [ "$total_size" -gt 0 ] 2>/dev/null; then
|
||||
if [ "$total_size" -gt 1073741824 ]; then
|
||||
size_str="$((total_size / 1073741824))GB"
|
||||
elif [ "$total_size" -gt 1048576 ]; then
|
||||
size_str="$((total_size / 1048576))MB"
|
||||
elif [ "$total_size" -gt 1024 ]; then
|
||||
size_str="$((total_size / 1024))KB"
|
||||
else
|
||||
size_str="${total_size}B"
|
||||
fi
|
||||
fi
|
||||
|
||||
echo " $cluster: $file_count files, $size_str"
|
||||
[ -n "$total_size" ] && record_usage "$cluster" "hot" "$file_count" "$total_size"
|
||||
else
|
||||
echo " $cluster: (未創建)"
|
||||
fi
|
||||
done
|
||||
|
||||
echo ""
|
||||
echo "臨時文件:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
for subdir in upload processing cache session; do
|
||||
subdir_path="$TEMP_DIR/$subdir"
|
||||
if [ -d "$subdir_path" ]; then
|
||||
file_count=$(find "$subdir_path" -type f 2>/dev/null | wc -l)
|
||||
size=$(du -sb "$subdir_path" 2>/dev/null | cut -f1)
|
||||
echo " $subdir: $file_count files"
|
||||
fi
|
||||
done
|
||||
|
||||
echo ""
|
||||
echo "備份目錄:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
for subdir in daily weekly monthly archive; do
|
||||
subdir_path="$BACKUP_DIR/$subdir"
|
||||
if [ -d "$subdir_path" ]; then
|
||||
file_count=$(find "$subdir_path" -type f 2>/dev/null | wc -l)
|
||||
size=$(du -sb "$subdir_path" 2>/dev/null | cut -f1)
|
||||
echo " $subdir: $file_count files"
|
||||
fi
|
||||
done
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
|
||||
# Show database statistics
|
||||
echo ""
|
||||
echo "資料庫統計 (file_registry):"
|
||||
psql -U accusys -h localhost -d momentry -t -A -c "
|
||||
SELECT user_cluster, storage_tier, COUNT(*) as files,
|
||||
SUM(file_size)::bigint as total_size
|
||||
FROM file_registry
|
||||
WHERE status = 'active'
|
||||
GROUP BY user_cluster, storage_tier;
|
||||
" 2>/dev/null || echo " (表未初始化)"
|
||||
}
|
||||
|
||||
# Clean up temporary files
|
||||
clean_temp() {
|
||||
echo "清理臨時文件..."
|
||||
|
||||
# Remove uploads older than 7 days
|
||||
find "$TEMP_DIR/upload" -type f -mtime +7 -delete 2>/dev/null
|
||||
|
||||
# Remove cache files older than 30 days
|
||||
find "$TEMP_DIR/cache" -type f -mtime +30 -delete 2>/dev/null
|
||||
|
||||
# Remove in-progress processing files older than 7 days
|
||||
find "$TEMP_DIR/processing" -type f -mtime +7 -delete 2>/dev/null
|
||||
|
||||
echo "清理完成"
|
||||
}
|
||||
|
||||
# Main program
|
||||
command=${1:-status}
|
||||
|
||||
case $command in
|
||||
status)
|
||||
show_status
|
||||
;;
|
||||
init)
|
||||
init_directories
|
||||
;;
|
||||
clean)
|
||||
clean_temp
|
||||
;;
|
||||
*)
|
||||
echo "用法: $0 {status|init|clean}"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
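The size formatting inside `show_status` can be factored into a small function so that the temporary and backup directories could report sizes the same way. This is a sketch only: `human_size` is a hypothetical name, and it uses `-ge` rather than the script's `-gt`, so exactly 1024 bytes prints as `1KB`. Integer division truncates, as in the original.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script):
# format a byte count with the same units that show_status prints.
human_size() {
    local bytes=${1:-0}
    if   [ "$bytes" -ge 1073741824 ]; then echo "$((bytes / 1073741824))GB"
    elif [ "$bytes" -ge 1048576 ];    then echo "$((bytes / 1048576))MB"
    elif [ "$bytes" -ge 1024 ];       then echo "$((bytes / 1024))KB"
    else                                   echo "${bytes}B"
    fi
}
```

For example, `echo " $cluster: $file_count files, $(human_size "$total_size")"` would replace the nested `if` chain.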
163
monitor/users/session_tracker.sh
Executable file
@@ -0,0 +1,163 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry user session tracking (Layer 6)
# Path: /Users/accusys/momentry_core_0.1/monitor/users/session_tracker.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/session_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
# Record a session
|
||||
record_session() {
|
||||
local session_type=$1
|
||||
local service=$2
|
||||
local username=$3
|
||||
local source_ip=$4
|
||||
local status=$5
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_sessions (session_type, service_name, username, source_ip, connected_at, status)
|
||||
VALUES ('$session_type', '$service', '$username', '$source_ip', NOW(), '$status');
|
||||
EOF
|
||||
}
|
||||
|
||||
# Record a login attempt
|
||||
record_login() {
|
||||
local user_type=$1
|
||||
local username=$2
|
||||
local source_ip=$3
|
||||
local success=$4
|
||||
local method=$5
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_logins (user_type, username, source_ip, success, login_method, login_at)
|
||||
VALUES ('$user_type', '$username', '$source_ip', $success, '$method', NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
# Record an anomaly
|
||||
record_anomaly() {
|
||||
local anomaly_type=$1
|
||||
local severity=$2
|
||||
local username=$3
|
||||
local source_ip=$4
|
||||
local description=$5
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_anomalies (anomaly_type, severity, source_type, username, source_ip, description, detected_at)
|
||||
VALUES ('$anomaly_type', '$severity', 'system', '$username', '$source_ip', '$description', NOW());
|
||||
EOF
|
||||
}
|
||||
|
||||
# SSH sessions
|
||||
track_ssh() {
|
||||
echo "SSH 會話:"
|
||||
|
||||
# List current SSH connections
who | grep -E "pts|tty" | while read -r line; do
user=$(echo "$line" | awk '{print $1}')
tty=$(echo "$line" | awk '{print $2}')
login_time=$(echo "$line" | awk '{print $3,$4}')
# Remote sessions end with "(host)"; local consoles have none, which leaves ip empty
ip=$(echo "$line" | sed -n 's/.*(\([^)]*\))$/\1/p')
|
||||
|
||||
if [ -n "$ip" ] && [ "$ip" != "-" ]; then
|
||||
echo " - $user @ $ip (tty $tty) 登入時間: $login_time"
|
||||
record_session "ssh" "sshd" "$user" "$ip" "active"
|
||||
fi
|
||||
done
|
||||
|
||||
# Check failed SSH logins
echo ""
echo "SSH 登入失敗 (最近 5 分鐘):"
# auth.log is plain text, so grep it directly (last only reads wtmp-format files)
grep -i "failed password" /var/log/auth.log 2>/dev/null | tail -5 | while read -r line; do
|
||||
user=$(echo "$line" | awk '{print $9}')
|
||||
ip=$(echo "$line" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | tail -1)
|
||||
|
||||
if [ -n "$ip" ]; then
|
||||
echo " - Failed: $user from $ip"
|
||||
record_login "system" "$user" "$ip" "false" "ssh"
|
||||
fi
|
||||
done
|
||||
}
|
||||
|
||||
# Web service sessions
|
||||
track_web() {
|
||||
echo ""
|
||||
echo "Web 服務:"
|
||||
|
||||
# Active n8n sessions (if authentication is enabled)
|
||||
n8n_sessions=0
|
||||
echo " - n8n: 檢查中... (需要 API key)"
|
||||
|
||||
# Active Gitea sessions
|
||||
gitea_sessions=0
|
||||
echo " - Gitea: 檢查中... (需要登入)"
|
||||
}
|
||||
|
||||
# Database connections
|
||||
track_database() {
|
||||
echo ""
|
||||
echo "資料庫連線:"
|
||||
|
||||
# PostgreSQL
|
||||
pg_conn=$(psql -U accusys -h localhost -d momentry -t -A -c "SELECT count(*) FROM pg_stat_activity WHERE datname = 'momentry';" 2>/dev/null || echo "0")
|
||||
echo " - PostgreSQL: $pg_conn connections"
|
||||
|
||||
# Redis
|
||||
redis_conn=$(redis-cli -a accusys INFO clients 2>/dev/null | grep "connected_clients" | cut -d: -f2 | tr -d '\r')
|
||||
echo " - Redis: $redis_conn clients"
|
||||
}
|
||||
|
||||
# SFTP sessions
|
||||
track_sftp() {
|
||||
echo ""
|
||||
echo "SFTP 會話:"
|
||||
|
||||
# Check SFTPGo online users
|
||||
if nc -z localhost 2222 2>/dev/null; then
|
||||
echo " - SFTPGo: 檢查中..."
|
||||
fi
|
||||
}
|
||||
|
||||
# Detect brute-force attempts
|
||||
detect_bruteforce() {
|
||||
echo ""
|
||||
echo "異常檢測:"
|
||||
|
||||
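# Check for SSH brute-force attempts
now=$(date +%s)
window=300 # 5 minutes

# Count failed password attempts in the whole log (the window above is not applied yet).
# auth.log is plain text, so it is read with grep; last only parses wtmp-format files.
fail_count=$(grep -ci "failed password" /var/log/auth.log 2>/dev/null)
fail_count=${fail_count:-0}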
|
||||
|
||||
if [ $fail_count -gt 10 ]; then
|
||||
echo " ⚠️ 發現潛在暴力破解嘗試: $fail_count 次失敗"
|
||||
record_anomaly "bruteforce" "critical" "unknown" "multiple" "SSH暴力破解: $fail_count 次失敗"
|
||||
else
|
||||
echo " ✓ 無明顯暴力破解跡象"
|
||||
fi
|
||||
}
|
||||
|
||||
# Main program
|
||||
echo "========================================"
|
||||
echo "Layer 6: User Session Tracking"
|
||||
echo "Time: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
track_ssh
|
||||
track_web
|
||||
track_database
|
||||
track_sftp
|
||||
detect_bruteforce
|
||||
|
||||
echo ""
|
||||
echo "========================================"
|
||||
log "Session tracking completed"
|
||||
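`detect_bruteforce` works from a single global failure count, which cannot distinguish one noisy client from many. A per-address breakdown is one way to refine it. The sketch below assumes OpenSSH-style `Failed password ... from <ip>` lines on standard input; `count_failures_by_ip` is a hypothetical name and is not part of the script.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script):
# read auth-log text on stdin and print "<ip> <failures>", most failures first.
count_failures_by_ip() {
    grep -i "failed password" \
        | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' \
        | sort | uniq -c | sort -rn \
        | awk '{print $2, $1}'
}
```

It could be fed with `count_failures_by_ip < /var/log/auth.log`, and each address whose count exceeds the threshold could then be passed to `record_anomaly` as its `source_ip`.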
265
monitor/workflow/n8n_workflow_monitor.sh
Executable file
@@ -0,0 +1,265 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Momentry n8n workflow monitoring (Layer 3)
# Path: /Users/accusys/momentry_core_0.1/monitor/workflow/n8n_workflow_monitor.sh
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
MONITOR_DIR="$(dirname "$SCRIPT_DIR")"
|
||||
LOG_DIR="/Users/accusys/momentry/log/monitor"
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
LOG_FILE="$LOG_DIR/workflow_check.log"
|
||||
|
||||
log() {
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
|
||||
}
|
||||
|
||||
# n8n API configuration
N8N_HOST="http://localhost:5678"
# Read the API key from the environment instead of committing it to the repository
N8N_API_KEY="${N8N_API_KEY:-}"
|
||||
|
||||
# Fetch workflows from the database
|
||||
fetch_workflows_from_db() {
|
||||
PGPASSWORD=accusys psql -U n8n -h localhost -d n8n -t -A <<'EOF'
|
||||
SELECT COALESCE(json_agg(row_to_json(t)), '[]'::json) FROM (
|
||||
SELECT w.id, w.name, w.active, w."createdAt", w."updatedAt",
|
||||
COALESCE(u.email, 'unknown') as owner_email
|
||||
FROM workflow_entity w
|
||||
LEFT JOIN shared_workflow sw ON w.id = sw."workflowId"
|
||||
LEFT JOIN project p ON sw."projectId" = p.id
|
||||
LEFT JOIN "user" u ON p."creatorId" = u.id
|
||||
) t
|
||||
EOF
|
||||
}
|
||||
|
||||
# Record a workflow
|
||||
record_workflow() {
|
||||
local wf_id=$1
|
||||
local wf_name=$2
|
||||
local active=$3
|
||||
local last_exec=$4
|
||||
local exec_count=$5
|
||||
local success=$6
|
||||
local failure=$7
|
||||
local avg_duration=$8
|
||||
local has_schedule=$9
|
||||
local has_webhook=${10}
|
||||
local idle_days=${11}
|
||||
local suggestion=${12}
|
||||
|
||||
psql -U accusys -h localhost -d momentry << EOF 2>/dev/null
|
||||
INSERT INTO monitor_workflows
|
||||
(workflow_id, workflow_name, is_active, last_executed_at, execution_count,
|
||||
success_count, failure_count, avg_duration_ms, has_schedule, has_webhook,
|
||||
idle_days, suggestion, checked_at)
|
||||
VALUES
|
||||
('$wf_id', '$wf_name', $active, $last_exec, $exec_count,
|
||||
$success, $failure, $avg_duration, $has_schedule, $has_webhook,
|
||||
$idle_days, '$suggestion', NOW())
|
||||
ON CONFLICT (workflow_id) DO UPDATE SET
|
||||
workflow_name = EXCLUDED.workflow_name,
|
||||
is_active = EXCLUDED.is_active,
|
||||
last_executed_at = EXCLUDED.last_executed_at,
|
||||
execution_count = EXCLUDED.execution_count,
|
||||
success_count = EXCLUDED.success_count,
|
||||
failure_count = EXCLUDED.failure_count,
|
||||
avg_duration_ms = EXCLUDED.avg_duration_ms,
|
||||
has_schedule = EXCLUDED.has_schedule,
|
||||
has_webhook = EXCLUDED.has_webhook,
|
||||
idle_days = EXCLUDED.idle_days,
|
||||
suggestion = EXCLUDED.suggestion,
|
||||
checked_at = NOW();
|
||||
EOF
|
||||
}
|
||||
|
||||
# Fetch the workflow list
fetch_workflows() {
# X-N8N-API-KEY authenticates the public API, which lives under /api/v1
curl -s -H "Accept: application/json" \
-H "X-N8N-API-KEY: ${N8N_API_KEY}" \
"${N8N_HOST}/api/v1/workflows" 2>/dev/null || echo "[]"
|
||||
}
|
||||
|
||||
# Fetch workflow execution statistics
fetch_executions() {
local wf_id=$1
# X-N8N-API-KEY authenticates the public API, which lives under /api/v1
curl -s -H "Accept: application/json" \
-H "X-N8N-API-KEY: ${N8N_API_KEY}" \
"${N8N_HOST}/api/v1/executions?workflowId=${wf_id}&limit=50" 2>/dev/null || echo "{\"data\":[]}"
|
||||
}
|
||||
|
||||
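# Check whether the workflow has a schedule trigger
has_schedule() {
local wf_data=$1
# n8n node types are namespaced, e.g. "n8n-nodes-base.scheduleTrigger" or the older "n8n-nodes-base.cron"
echo "$wf_data" | grep -qE '"type": *"n8n-nodes-base\.(scheduleTrigger|cron)"' && echo "true" || echo "false"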
|
||||
}
|
||||
|
||||
# Check whether the workflow has a webhook trigger
has_webhook() {
local wf_data=$1
# n8n node types are namespaced, e.g. "n8n-nodes-base.webhook"
echo "$wf_data" | grep -qE '"type": *"n8n-nodes-base\.webhook"' && echo "true" || echo "false"
|
||||
}
|
||||
|
||||
# Compute idle days
|
||||
calc_idle_days() {
|
||||
local last_exec=$1
|
||||
if [ "$last_exec" = "null" ] || [ -z "$last_exec" ]; then
|
||||
echo "999"
|
||||
else
|
||||
echo "0"
|
||||
fi
|
||||
}
|
||||
|
||||
# Generate a suggestion
|
||||
generate_suggestion() {
|
||||
local has_schedule=$1
|
||||
local has_webhook=$2
|
||||
local idle_days=$3
|
||||
local failure_rate=$4
|
||||
|
||||
if [ "$idle_days" -ge 90 ]; then
|
||||
echo "建議刪除"
|
||||
elif [ "$idle_days" -ge 30 ] && [ "$has_schedule" = "false" ] && [ "$has_webhook" = "false" ]; then
|
||||
echo "建議停用"
|
||||
elif [ "$failure_rate" -gt 20 ]; then
|
||||
echo "建議優化"
|
||||
else
|
||||
echo ""
|
||||
fi
|
||||
}
|
||||
|
||||
# Main program
|
||||
echo "========================================"
|
||||
echo "Layer 3: n8n Workflow Monitoring"
|
||||
echo "Time: $(date)"
|
||||
echo "========================================"
|
||||
echo ""
|
||||
|
||||
# Check that n8n is available (probe the n8n database in PostgreSQL)
|
||||
if ! PGPASSWORD=accusys psql -U n8n -h localhost -d n8n -c "SELECT 1" >/dev/null 2>&1; then
|
||||
echo "n8n 資料庫不可用"
|
||||
log "n8n database unavailable"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Fetch the workflow list (from the database)
|
||||
workflows=$(fetch_workflows_from_db)
|
||||
total_count=$(echo "$workflows" | jq 'length' 2>/dev/null || echo "0")
|
||||
active_count=$(echo "$workflows" | jq '[.[] | select(.active == true)] | length' 2>/dev/null || echo "0")
|
||||
|
||||
echo "總 Workflow: $total_count"
|
||||
echo "啟用中: $active_count"
|
||||
echo ""
|
||||
|
||||
# Idle threshold (days)
|
||||
IDLE_THRESHOLD=30
|
||||
|
||||
echo "Workflow 詳細:"
|
||||
echo "----------------------------------------"
|
||||
|
||||
total_idle=0
|
||||
|
||||
for wf in $(echo "$workflows" | jq -r '.[] | @base64' 2>/dev/null); do
|
||||
# --decode is accepted by both GNU and BSD/macOS base64
wf_decoded=$(echo "$wf" | base64 --decode)
|
||||
|
||||
wf_id=$(echo "$wf_decoded" | jq -r '.id' 2>/dev/null)
|
||||
wf_name=$(echo "$wf_decoded" | jq -r '.name' 2>/dev/null)
|
||||
is_active=$(echo "$wf_decoded" | jq -r '.active' 2>/dev/null)
|
||||
wf_owner=$(echo "$wf_decoded" | jq -r '.owner_email' 2>/dev/null)
|
||||
|
||||
# Fetch execution data from the database
|
||||
exec_data=$(PGPASSWORD=accusys psql -U n8n -h localhost -d n8n -t -A <<EOF
|
||||
SELECT COALESCE(json_agg(row_to_json(t)), '[]'::json) FROM (
|
||||
SELECT status, "startedAt", "stoppedAt",
|
||||
EXTRACT(EPOCH FROM ("stoppedAt" - "startedAt")) * 1000 as execution_time
|
||||
FROM execution_entity
|
||||
WHERE "workflowId" = '$wf_id'
|
||||
ORDER BY "startedAt" DESC
|
||||
LIMIT 50
|
||||
) t
|
||||
EOF
|
||||
)
|
||||
|
||||
exec_count=$(echo "$exec_data" | jq '. | length' 2>/dev/null || echo "0")
|
||||
|
||||
# Count successes and failures
|
||||
success_count=$(echo "$exec_data" | jq '[.[] | select(.status == "success")] | length' 2>/dev/null || echo "0")
|
||||
failure_count=$(echo "$exec_data" | jq '[.[] | select(.status == "error")] | length' 2>/dev/null || echo "0")
|
||||
|
||||
# Average execution time
|
||||
avg_duration=$(echo "$exec_data" | jq '[.[] | .execution_time] | map(select(. != null)) | add / length | floor' 2>/dev/null || echo "0")
|
||||
|
||||
# Check whether the workflow has a webhook (n8n column names are camelCase and must be quoted)
has_webh=$(PGPASSWORD=accusys psql -U n8n -h localhost -d n8n -t -A -c "
SELECT COUNT(*) FROM webhook_entity WHERE \"workflowId\" = '$wf_id'
|
||||
" 2>/dev/null || echo "0")
|
||||
[ "$has_webh" -gt 0 ] && has_webh="true" || has_webh="false"
|
||||
has_sched="false"
|
||||
|
||||
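# Last execution time
last_exec=$(echo "$exec_data" | jq -r '.[0].startedAt // "null"' 2>/dev/null | head -1)
if [ "$last_exec" = "null" ] || [ -z "$last_exec" ]; then
    idle_days=999
else
    # Let PostgreSQL compute whole days since the most recent execution
    idle_days=$(PGPASSWORD=accusys psql -U n8n -h localhost -d n8n -t -A -c "
        SELECT COALESCE(EXTRACT(DAY FROM NOW() - MAX(\"startedAt\"))::int, 0)
        FROM execution_entity WHERE \"workflowId\" = '$wf_id'
    " 2>/dev/null)
    idle_days=${idle_days:-0}
fi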
|
||||
|
||||
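# Normalize the numeric fields: keep digits only and default to 0 when nothing is left
# (the old "|| echo 0" never fired because tr succeeds even on empty input)
exec_count=$(echo "$exec_count" | tr -cd '0-9'); exec_count=${exec_count:-0}
success_count=$(echo "$success_count" | tr -cd '0-9'); success_count=${success_count:-0}
failure_count=$(echo "$failure_count" | tr -cd '0-9'); failure_count=${failure_count:-0}
avg_duration=$(echo "$avg_duration" | tr -cd '0-9'); avg_duration=${avg_duration:-0}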
|
||||
|
||||
# Compute the failure rate
|
||||
if [ -n "$exec_count" ] && [ "$exec_count" -gt 0 ] 2>/dev/null; then
|
||||
failure_rate=$(( failure_count * 100 / exec_count ))
|
||||
else
|
||||
failure_rate=0
|
||||
fi
|
||||
|
||||
# Generate a suggestion
|
||||
suggestion=$(generate_suggestion "$has_sched" "$has_webh" "$idle_days" "$failure_rate")
|
||||
|
||||
# Record to the database
|
||||
if [ "$last_exec" = "null" ] || [ -z "$last_exec" ]; then
|
||||
record_workflow "$wf_id" "$wf_name" "$is_active" "NULL" "$exec_count" "$success_count" "$failure_count" "$avg_duration" "$has_sched" "$has_webh" "$idle_days" "$suggestion"
|
||||
else
|
||||
record_workflow "$wf_id" "$wf_name" "$is_active" "'$last_exec'" "$exec_count" "$success_count" "$failure_count" "$avg_duration" "$has_sched" "$has_webh" "$idle_days" "$suggestion"
|
||||
fi
|
||||
|
||||
# Display
|
||||
status_icon="○"
|
||||
if [ "$is_active" = "true" ]; then
|
||||
status_icon="●"
|
||||
fi
|
||||
|
||||
idle_info=""
|
||||
if [ "$idle_days" -ge "$IDLE_THRESHOLD" ]; then
|
||||
idle_info=" [閒置 $idle_days 天]"
|
||||
total_idle=$((total_idle + 1))
|
||||
fi
|
||||
|
||||
suggestion_info=""
|
||||
if [ -n "$suggestion" ]; then
|
||||
suggestion_info=" [$suggestion]"
|
||||
fi
|
||||
|
||||
echo "$status_icon $wf_name (ID: $wf_id) [$wf_owner]$idle_info$suggestion_info"
|
||||
echo " 執行: $exec_count (成功: $success_count, 失敗: $failure_count) | 平均: ${avg_duration}ms"
|
||||
done
|
||||
|
||||
echo "----------------------------------------"
|
||||
echo "閒置 Workflow (> $IDLE_THRESHOLD 天): $total_idle"
|
||||
echo ""
|
||||
|
||||
log "Workflow check completed: $total_count total, $total_idle idle"
|
||||
|
||||
# Show idle workflows
|
||||
if [ $total_idle -gt 0 ]; then
|
||||
echo ""
|
||||
echo "閒置 Workflow 建議:"
|
||||
psql -U accusys -h localhost -d momentry -t -A -c "
|
||||
SELECT ' - ' || workflow_name || ': ' || suggestion
|
||||
FROM monitor_workflows
|
||||
WHERE idle_days >= $IDLE_THRESHOLD AND suggestion != '';
|
||||
" 2>/dev/null
|
||||
fi
|
||||
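`record_workflow` and the other `record_*` functions splice shell variables straight into SQL text, so a workflow name containing a single quote would break the `INSERT`. One possible mitigation is to quote values before interpolation. The sketch below is illustrative only: `sql_quote` is a hypothetical name, and it assumes PostgreSQL's default `standard_conforming_strings = on`, where doubling the single quote is the only escaping a string literal needs.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script):
# print the argument as a single-quoted SQL string literal.
sql_quote() {
    local s=$1 q="'"
    # Double every embedded single quote: O'Brien -> O''Brien
    s=${s//$q/$q$q}
    printf "'%s'" "$s"
}
```

With such a helper the heredoc would use `$(sql_quote "$wf_name")` in place of `'$wf_name'`, since the helper supplies the surrounding quotes itself.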