Merged
22 changes: 22 additions & 0 deletions .gitattributes
@@ -0,0 +1,22 @@
# Default: normalize line endings of all text files to LF
* text=auto eol=lf

# Exception: exclude HTML/CSS/JS from line-ending conversion
*.html -text
*.css -text
*.js -text

# Common binary formats
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.webp binary
*.ico binary
*.pdf binary
*.zip binary
*.tar binary
*.gz binary
*.bz2 binary
*.xz binary

30 changes: 29 additions & 1 deletion README.md
@@ -1 +1,29 @@
# Systools
# systools

## Install

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

## Run

```bash
# redis
python monitor.py --target redis --redis-url redis://localhost:6379/0 --interval 5 --output pretty

# linux
python monitor.py --target linux --interval 5 --output json
```

- `--target`: redis | linux | kafka | jvm
- `--redis-url`: Redis connection URL (e.g. `redis://:password@host:6379/0`)
- `--interval`: collection interval in seconds; 0 collects once and exits
- `--output`: pretty | json

Using a config file:
```bash
python monitor.py --target redis --config config.yaml
```
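The README does not pin down the `config.yaml` schema or how `--config` interacts with explicit flags. A minimal sketch of one common precedence scheme (CLI flags override config-file values, which override built-in defaults), using hypothetical keys that mirror the CLI options:

```python
# Sketch only: the key names below are hypothetical; the real schema
# lives in config.yaml and monitor.py.

def merge_options(cli_args: dict, config: dict, defaults: dict) -> dict:
    """CLI flags override config-file values, which override defaults."""
    merged = dict(defaults)
    merged.update({k: v for k, v in config.items() if v is not None})
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged

defaults = {"target": "redis", "interval": 5, "output": "pretty"}
config = {"interval": 10, "redis_url": "redis://localhost:6379/0"}  # e.g. parsed via yaml.safe_load
cli = {"output": "json"}  # only flags the user actually passed

opts = merge_options(cli, config, defaults)
```

Here `opts` ends up with `interval` from the config file and `output` from the CLI, which keeps one-off overrides cheap while the config file carries the stable settings.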
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -1,4 +1,4 @@
gversion: "3.8"
version: "3.8"

services:
redis:
2 changes: 0 additions & 2 deletions docs/jvm/.gitkeep

This file was deleted.

50 changes: 50 additions & 0 deletions docs/jvm/metrics.md
@@ -0,0 +1,50 @@
# JVM monitoring metrics (24 selected; structure finalized)

This document defines the structure and meaning of 24 JVM metric fields. Actual values will be filled in later via a JMX integration (e.g. Jolokia or pyjmx).

## Memory
1) heap_used_bytes
2) heap_committed_bytes
3) heap_max_bytes
4) non_heap_used_bytes
5) non_heap_committed_bytes
6) metaspace_used_bytes
7) metaspace_committed_bytes
- Meaning: JVM heap/non-heap used, committed, and max memory, plus Metaspace usage

## GC
8) young_gc_count
9) young_gc_time_ms
10) old_gc_count
11) old_gc_time_ms
- Meaning: Young/Old generation (G1/Parallel/CMS, etc.) collection counts and times

## Threads
12) thread_count
13) daemon_thread_count
14) peak_thread_count
15) deadlocked_thread_count
- Meaning: thread, daemon, peak, and deadlocked thread counts

## ClassLoading
16) loaded_class_count
17) total_loaded_class_count
18) unloaded_class_count
- Meaning: class loading/unloading statistics

## CPU
19) process_cpu_load
20) system_cpu_load
21) process_cpu_time_ns
- Meaning: process/system CPU load and cumulative process CPU time

## Runtime
22) uptime_ms
23) compiler_total_time_ms
24) safepoint_count
- Meaning: JVM uptime, cumulative JIT compilation time, and safepoint entry count

Notes
- Standard and vendor metrics exposed by HotSpot MXBeans, JFR, or Jolokia are used preferentially.
- Actual collection requires connection settings such as the JMX URL, authentication, and SSL.
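As a sketch of how the Memory fields (items 1-3) could eventually be filled over Jolokia, whose HTTP `read` operation returns the `java.lang:type=Memory` MBean as JSON: the endpoint, authentication, and error handling are omitted, and only the response parsing is shown.

```python
from typing import Any, Dict

def heap_metrics(jolokia_reply: Dict[str, Any]) -> Dict[str, int]:
    # Jolokia read replies wrap the MBean attributes under "value";
    # HeapMemoryUsage is a composite of init/used/committed/max.
    usage = jolokia_reply["value"]["HeapMemoryUsage"]
    return {
        "heap_used_bytes": usage["used"],
        "heap_committed_bytes": usage["committed"],
        "heap_max_bytes": usage["max"],
    }

# A sample reply of the shape Jolokia returns for
# GET /jolokia/read/java.lang:type=Memory
sample = {
    "status": 200,
    "value": {
        "HeapMemoryUsage": {
            "init": 268435456,
            "used": 52428800,
            "committed": 134217728,
            "max": 1073741824,
        }
    },
}

print(heap_metrics(sample))
```

The non-heap and Metaspace fields would be read the same way from `NonHeapMemoryUsage` and the `java.lang:type=MemoryPool,name=Metaspace` MBean.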

2 changes: 0 additions & 2 deletions docs/kafka/.gitkeep

This file was deleted.

30 changes: 30 additions & 0 deletions docs/kafka/metrics.md
@@ -0,0 +1,30 @@
# Kafka monitoring metrics (draft)

This module collects lightweight metrics using `kafka-python`.

## Collected items
- broker
  - num_brokers: broker count from client metadata (approximate)
- topics
  - num_topics: number of topics
  - num_partitions: total partition count across all topics
- lag (optional)
  - consumer_group_id: group ID passed on the CLI
  - consumer_lag_total: sum over partitions of end_offset - committed_offset (negative values clamped to 0)
- meta
  - timestamp, bootstrap_servers

Notes
- Group lag is computed only when `--kafka-group` is provided.
- Broker count and metadata come from the client's internal metadata and may be temporarily inaccurate.
- Throughput (bytes in/out per second) is planned as a follow-up via broker JMX or an admin API.
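The `consumer_lag_total` definition above (per-partition `end_offset - committed_offset`, negatives clamped to 0, partitions without a committed offset skipped) can be sketched independently of any live cluster:

```python
# Sketch: total consumer lag from two offset maps keyed by
# (topic, partition). Matches the definition above: missing commits
# are skipped, negative differences are clamped to 0.

def total_lag(end_offsets: dict, committed: dict) -> int:
    total = 0
    for tp, end in end_offsets.items():
        c = committed.get(tp)
        if c is None:
            continue  # no committed offset for this partition
        total += max(end - c, 0)
    return total

ends = {("orders", 0): 120, ("orders", 1): 80, ("audit", 0): 10}
commits = {("orders", 0): 100, ("orders", 1): 85}  # partition 1 over-reports

print(total_lag(ends, commits))
```

In the collector itself the two maps come from `KafkaConsumer.end_offsets()` and `KafkaConsumer.committed()` per partition.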

## Usage examples
```bash
# topic/partition/broker counts only
python monitor.py --target kafka --kafka-bootstrap localhost:9092 --output json

# including group lag
python monitor.py --target kafka --kafka-bootstrap localhost:9092 --kafka-group my-consumer --output json
```

2 changes: 0 additions & 2 deletions docs/linux/.gitkeep

This file was deleted.

35 changes: 35 additions & 0 deletions docs/linux/metrics.md
@@ -0,0 +1,35 @@
# Linux monitoring metrics

Lightweight collection of a Linux host's core system metrics from `/proc`.

## System
- uptime_seconds: host uptime in seconds — `/proc/uptime`
- cpu_count: number of logical cores
- loadavg_1/5/15: system load averages — `os.getloadavg()`
- cpu_usage_percent: CPU utilization computed from a short sample (default 200 ms) — `/proc/stat`
- process_count: current number of processes — count of PID entries under `/proc`

## Memory
- mem_total_kb, mem_available_kb — `/proc/meminfo`
- mem_used_kb, mem_used_percent
- swap_total_kb, swap_free_kb, swap_used_kb, swap_used_percent

## Disk
- Per-mount usage (physical filesystems only: ext2/3/4, xfs, btrfs, zfs)
- {mount}.total_bytes, used_bytes, free_bytes, used_percent — `shutil.disk_usage`

## Network
- rx_bytes_total, tx_bytes_total — totals across `/proc/net/dev`
- interfaces: per-interface {rx_bytes, rx_packets, tx_bytes, tx_packets}

## File system
- fd_allocated, fd_max — `/proc/sys/fs/file-nr`

## Meta
- timestamp, node (hostname), kernel (full uname string)

Notes
- Because collection depends on `/proc`, access may be restricted in some container or hardened environments.
- The CPU% figure is based on short sampling and adds a small delay (default ~200 ms).
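The `cpu_usage_percent` sampling described above boils down to reading the aggregate `cpu` line of `/proc/stat` twice and comparing busy time to total time over the interval. A sketch over two hypothetical samples (so it runs anywhere, not just on Linux):

```python
# Sketch: CPU utilization from two /proc/stat "cpu" line samples.
# Field order: user nice system idle iowait irq softirq steal ...
# Idle time = idle + iowait; everything else counts as busy.

def cpu_percent(sample1: list[int], sample2: list[int]) -> float:
    idle1 = sample1[3] + sample1[4]
    idle2 = sample2[3] + sample2[4]
    total1, total2 = sum(sample1), sum(sample2)
    dt = total2 - total1
    if dt <= 0:
        return 0.0  # no elapsed jiffies between samples
    return 100.0 * (dt - (idle2 - idle1)) / dt

# Two hypothetical readings ~200 ms apart (values in jiffies):
s1 = [4000, 10, 1200, 9000, 300, 0, 50, 0]
s2 = [4080, 10, 1230, 9070, 310, 0, 55, 0]

print(round(cpu_percent(s1, s2), 1))
```

On a live host the two samples would come from splitting the first line of `/proc/stat` with a short `time.sleep()` in between, which is exactly the delay the note above refers to.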


2 changes: 0 additions & 2 deletions docs/redis/.gitkeep

This file was deleted.

2 changes: 0 additions & 2 deletions jvm/.gitkeep

This file was deleted.

62 changes: 62 additions & 0 deletions jvm/collector.py
@@ -0,0 +1,62 @@
import time
from typing import Any, Dict


class JvmMetricsCollector:
def __init__(self, jmx_url: str | None = None):
# The real implementation will fill these values via a JMX integration (e.g. Jolokia or pyjmx)
self.jmx_url = jmx_url

def collect_all(self) -> Dict[str, Dict[str, Any]]:
# 24 representative metric fields (default None); structure finalized
memory = {
"heap_used_bytes": None, # 1
"heap_committed_bytes": None, # 2
"heap_max_bytes": None, # 3
"non_heap_used_bytes": None, # 4
"non_heap_committed_bytes": None, # 5
"metaspace_used_bytes": None, # 6
"metaspace_committed_bytes": None, # 7
}
gc = {
"young_gc_count": None, # 8
"young_gc_time_ms": None, # 9
"old_gc_count": None, # 10
"old_gc_time_ms": None, # 11
}
threads = {
"thread_count": None, # 12
"daemon_thread_count": None, # 13
"peak_thread_count": None, # 14
"deadlocked_thread_count": None, # 15
}
classloading = {
"loaded_class_count": None, # 16
"total_loaded_class_count": None, # 17
"unloaded_class_count": None, # 18
}
cpu = {
"process_cpu_load": None, # 19
"system_cpu_load": None, # 20
"process_cpu_time_ns": None, # 21
}
runtime = {
"uptime_ms": None, # 22
"compiler_total_time_ms": None, # 23
"safepoint_count": None, # 24
}
return {
"memory": memory,
"gc": gc,
"threads": threads,
"classloading": classloading,
"cpu": cpu,
"runtime": runtime,
"meta": {
"timestamp": int(time.time()),
"jmx_url": self.jmx_url,
"not_implemented": True,
},
}


2 changes: 0 additions & 2 deletions kafka/.gitkeep

This file was deleted.

116 changes: 116 additions & 0 deletions kafka/collector.py
@@ -0,0 +1,116 @@
# Keep annotations lazy so `KafkaConsumer | None` hints below don't raise
# at import time when kafka-python is missing and KafkaConsumer is None.
from __future__ import annotations

import time
from typing import Any, Dict, List

try:
from kafka import KafkaConsumer, TopicPartition
from kafka.errors import KafkaError
except Exception: # pragma: no cover
KafkaConsumer = None
TopicPartition = None
KafkaError = Exception


class KafkaMetricsCollector:
def __init__(self, bootstrap_servers: str | None = None, group_id: str | None = None, timeout_ms: int = 3000):
self.bootstrap_servers = bootstrap_servers
self.group_id = group_id
self.timeout_ms = timeout_ms

def _build_consumer(self) -> KafkaConsumer | None:
if KafkaConsumer is None or not self.bootstrap_servers:
return None
try:
consumer = KafkaConsumer(
bootstrap_servers=self.bootstrap_servers,
group_id=self.group_id if self.group_id else None,
client_id="systools-kafka",
enable_auto_commit=False,
consumer_timeout_ms=self.timeout_ms,
request_timeout_ms=max(self.timeout_ms, 5000),
metadata_max_age_ms=5000,
api_version_auto_timeout_ms=3000,
)
return consumer
except Exception:
return None

def _compute_topics_partitions(self, consumer: KafkaConsumer) -> Dict[str, int]:
num_topics = 0
num_partitions = 0
try:
topics = consumer.topics()
num_topics = len(topics or [])
for t in topics or []:
parts = consumer.partitions_for_topic(t)
if parts:
num_partitions += len(parts)
except Exception:
pass
return {"num_topics": num_topics, "num_partitions": num_partitions}

def _num_brokers(self, consumer: KafkaConsumer) -> int | None:
try:
cluster = consumer._client.cluster # internal attribute (None if unavailable)
if cluster:
return len(cluster.brokers())
except Exception:
return None
return None

def _group_lag(self, consumer: KafkaConsumer) -> int | None:
if not self.group_id:
return None
try:
topics = list(consumer.topics() or [])
tps: List[TopicPartition] = []
for t in topics:
parts = consumer.partitions_for_topic(t) or []
for p in parts:
tps.append(TopicPartition(t, p))
if not tps:
return 0
end_offsets = consumer.end_offsets(tps)
total_lag = 0
for tp in tps:
committed = consumer.committed(tp)
end = end_offsets.get(tp, None)
if committed is None or end is None:
continue
lag = max(end - committed, 0)
total_lag += lag
return total_lag
except Exception:
return None

def collect_all(self) -> Dict[str, Dict[str, Any]]:
consumer = self._build_consumer()

num_brokers = None
topics_info = {"num_topics": None, "num_partitions": None}
group_lag_total = None

if consumer:
num_brokers = self._num_brokers(consumer)
topics_info = self._compute_topics_partitions(consumer)
group_lag_total = self._group_lag(consumer)
try:
consumer.close()
except Exception:
pass

return {
"broker": {
"num_brokers": num_brokers,
},
"topics": topics_info,
"lag": {
"consumer_group_id": self.group_id,
"consumer_lag_total": group_lag_total,
},
"meta": {
"timestamp": int(time.time()),
"bootstrap_servers": self.bootstrap_servers,
},
}


2 changes: 0 additions & 2 deletions linux/.gitkeep

This file was deleted.
