Commit 1c68478

Update README.md
1 parent 1768ee1 commit 1c68478

1 file changed

+90
-68
lines changed

README.md

Lines changed: 90 additions & 68 deletions
Original file line number | Diff line number | Diff line change
@@ -15,8 +15,8 @@ Apache HugeGraph-Computer is a comprehensive graph computing solution providing
1515

1616
| Feature | Vermeer (Go) | Computer (Java) |
1717
|---------|--------------|-----------------|
18-
| **Best for** | Single machine, quick start | Large-scale distributed computing |
19-
| **Deployment** | Single binary | Kubernetes or YARN cluster |
18+
| **Best for** | Quick start, flexible deployment | Large-scale distributed computing |
19+
| **Deployment** | Single binary, multi-node capable | Kubernetes or YARN cluster |
2020
| **Memory model** | In-memory first | Auto spill to disk |
2121
| **Setup time** | Minutes | Hours (requires K8s/YARN) |
2222
| **Algorithms** | 20+ algorithms | 45+ algorithms |
@@ -49,6 +49,65 @@ graph TB
4949
style Computer fill:#fff3e0
5050
```
5151

52+
## Vermeer Architecture (In-Memory Engine)
53+
54+
Vermeer is designed with a Master-Worker architecture optimized for high-performance in-memory graph computing:
55+
56+
```mermaid
57+
graph TB
58+
subgraph Client["Client Layer"]
59+
API[REST API Client]
60+
UI[Web UI Dashboard]
61+
end
62+
63+
subgraph Master["Master Node :6688"]
64+
HTTP[HTTP Server]
65+
GRPC_M[gRPC Server :6689]
66+
GM[Graph Manager]
67+
TM[Task Manager]
68+
WM[Worker Manager]
69+
end
70+
71+
subgraph Workers["Worker Nodes"]
72+
W1[Worker 1 :6789]
73+
W2[Worker 2 :6789]
74+
W3[Worker N :6789]
75+
end
76+
77+
subgraph DataSources["Data Sources"]
78+
HG[(HugeGraph)]
79+
CSV[Local CSV]
80+
HDFS[HDFS]
81+
end
82+
83+
API --> HTTP
84+
UI --> HTTP
85+
GRPC_M <--> W1
86+
GRPC_M <--> W2
87+
GRPC_M <--> W3
88+
89+
W1 -.-> HG
90+
W2 -.-> HG
91+
W3 -.-> HG
92+
W1 -.-> CSV
93+
W1 -.-> HDFS
94+
95+
style Master fill:#e1f5fe
96+
style Workers fill:#f3e5f5
97+
style DataSources fill:#fff9c4
98+
```
99+
100+
**Component Overview:**
101+
102+
| Component | Description |
103+
|-----------|-------------|
104+
| **Master** | Coordinates workers, manages graph metadata, schedules computation tasks via HTTP (:6688) and gRPC (:6689) |
105+
| **Workers** | Execute graph algorithms, store graph partition data in memory, communicate via gRPC (:6789) |
106+
| **REST API** | Graph loading, algorithm execution, result queries (port 6688) |
107+
| **Web UI** | Built-in monitoring dashboard accessible at `/ui/` |
108+
| **Data Sources** | Supports loading from HugeGraph (via gRPC), local CSV files, and HDFS |
109+
110+
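The ports named in the component table (6688 for the master's HTTP server, 6689 for the master's gRPC server, 6789 for a worker) can be probed with a small Python sketch like the one below. The host names are placeholders and `port_open` / `check_endpoints` are hypothetical helpers, not part of Vermeer. The check only tests TCP reachability and says nothing about whether the service behind the port is healthy.

```python
import socket

# Ports come from the component table; the host names are placeholders
# and should be replaced with the real master/worker addresses.
VERMEER_ENDPOINTS = {
    "master-http": ("localhost", 6688),
    "master-grpc": ("localhost", 6689),
    "worker-grpc": ("localhost", 6789),
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_endpoints(endpoints=VERMEER_ENDPOINTS) -> dict:
    """Map each endpoint name to its TCP reachability."""
    return {name: port_open(host, port) for name, (host, port) in endpoints.items()}


if __name__ == "__main__":
    for name, ok in check_endpoints().items():
        print(f"{name}: {'reachable' if ok else 'unreachable'}")
```

Running the script on a machine that can reach the cluster prints one line per endpoint, which is usually enough to tell a networking problem apart from a misconfigured job.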
52111
## HugeGraph Ecosystem Integration
53112

54113
```
@@ -107,91 +166,54 @@ See the **[Vermeer README](./vermeer/README.md)** for detailed configuration and
107166

108167
## Getting Started with Computer (Distributed)
109168

110-
For large-scale distributed graph processing across clusters:
111-
112-
### Prerequisites
169+
For large-scale distributed graph processing on Kubernetes or YARN clusters, see the **[Computer README](./computer/README.md)** for:
113170

114-
- JDK 11 or later
115-
- Maven 3.5+
116-
- Kubernetes cluster or YARN cluster
117-
118-
### Build from Source
119-
120-
```bash
121-
cd computer
122-
mvn clean package -DskipTests
123-
```
124-
125-
### Deploy on Kubernetes
126-
127-
```bash
128-
# Configure your K8s cluster in computer-k8s module
129-
# Submit a graph computing job
130-
java -jar computer-driver.jar --config job-config.properties
131-
```
132-
133-
See the **[Computer README](./computer/README.md)** for detailed deployment and development guide.
171+
- Prerequisites and build instructions
172+
- Kubernetes/YARN deployment guide
173+
- 45+ algorithm implementations
174+
- Custom algorithm development framework
134175

135176
## Supported Algorithms
136177

137-
### Common Algorithms (Both Systems)
178+
### Vermeer Algorithms (20+)
138179

139180
| Category | Algorithms |
140181
|----------|-----------|
141-
| **Centrality** | PageRank, Personalized PageRank, Betweenness Centrality, Closeness Centrality, Degree Centrality |
142-
| **Community Detection** | Louvain, LPA (Label Propagation), SLPA, WCC (Weakly Connected Components) |
143-
| **Path Finding** | SSSP (Single Source Shortest Path), BFS (Breadth-First Search) |
144-
| **Graph Structure** | Triangle Count, K-Core, Clustering Coefficient, Cycle Detection |
182+
| **Centrality** | PageRank, Personalized PageRank, Betweenness, Closeness, Degree |
183+
| **Community** | Louvain, Weighted Louvain, LPA, SLPA, WCC, SCC |
184+
| **Path Finding** | SSSP (Dijkstra), BFS Depth |
185+
| **Structure** | Triangle Count, K-Core, K-Out, Clustering Coefficient, Cycle Detection |
145186
| **Similarity** | Jaccard Similarity |
146187

147-
### Vermeer-Specific Features
148-
188+
**Features:**
149189
- In-memory optimized implementations
150-
- Weighted Louvain variant
151190
- REST API for algorithm execution
191+
- Real-time result queries
152192

153-
### Computer-Specific Algorithms
154-
155-
- Count Triangle (distributed implementation)
156-
- Rings detection
157-
- ClusteringCoefficient variations
158-
- Custom algorithm development framework
159-
160-
See individual README files for complete algorithm lists and usage examples.
161-
162-
## Performance Characteristics
163-
164-
### Vermeer (In-Memory)
165-
166-
- **Throughput**: Optimized for fast iteration on medium-sized graphs (millions of vertices/edges)
167-
- **Latency**: Sub-second query response via REST API
168-
- **Memory**: Requires graph to fit in total worker memory
169-
- **Scalability**: Horizontal scaling by adding worker nodes
193+
> **Computer (Java) Algorithms**: For Computer's 45+ algorithm implementations, including distributed Triangle Count, Rings detection, and a custom algorithm development framework, see the [Computer Algorithm List](./computer/README.md#available-algorithms).
170194
171-
### Computer (Distributed BSP)
195+
## When to Use Which
172196

173-
- **Throughput**: Handles billions of vertices/edges via distributed processing
174-
- **Latency**: Batch-oriented with superstep barriers
175-
- **Memory**: Auto spill to disk when memory is insufficient
176-
- **Scalability**: Elastic scaling on K8s with pod autoscaling
197+
### Choose Vermeer when:
177198

178-
## Use Cases
199+
- ✅ Quick prototyping and experimentation
200+
- ✅ Interactive analytics with built-in Web UI
201+
- ✅ Graphs up to hundreds of millions of edges
202+
- ✅ REST API integration requirements
203+
- ✅ Single machine or small cluster with high-memory nodes
204+
- ✅ Sub-second query response requirements
179205

180-
### When to Use Vermeer
206+
**Performance**: Optimized for fast iteration on medium-sized graphs with in-memory processing. Horizontal scaling by adding worker nodes.
181207

182-
- Quick prototyping and experimentation
183-
- Interactive graph analytics with Web UI
184-
- Medium-scale graphs (up to hundreds of millions of edges)
185-
- Single-machine or small cluster deployments
186-
- REST API integration requirements
208+
### Choose Computer when:
187209

188-
### When to Use Computer
210+
- ✅ Billions of vertices/edges requiring distributed processing
211+
- ✅ Existing Kubernetes or YARN infrastructure
212+
- ✅ Custom algorithm development with Java
213+
- ✅ Memory-constrained environments (auto disk spill)
214+
- ✅ Integration with Hadoop ecosystem
189215

190-
- Large-scale batch processing (billions of vertices)
191-
- Existing Kubernetes or YARN infrastructure
192-
- Custom algorithm development with Java
193-
- Memory-constrained environments (auto spill to disk)
194-
- Integration with Hadoop ecosystem
216+
**Performance**: Handles massive graphs via distributed BSP framework. Batch-oriented with superstep barriers. Elastic scaling on K8s.
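
The two checklists above can be condensed into a small decision helper. This is only an illustrative sketch: the `Workload` type, the `suggest_engine` function, and the one-billion-edge cut-off are assumptions made for the example (the README says "hundreds of millions of edges" for Vermeer and "billions" for Computer without naming an exact threshold).

```python
from dataclasses import dataclass


@dataclass
class Workload:
    edges: int                          # approximate number of edges
    fits_in_worker_memory: bool         # does the graph fit in total worker RAM?
    needs_custom_java_algorithm: bool = False


# Assumed cut-off for illustration; the README gives no exact number.
BILLION_EDGES = 1_000_000_000


def suggest_engine(w: Workload) -> str:
    """Return "Vermeer" or "Computer" following the checklists above."""
    if w.needs_custom_java_algorithm:
        return "Computer"
    if w.edges >= BILLION_EDGES or not w.fits_in_worker_memory:
        # Computer spills to disk, but it needs a Kubernetes or YARN cluster.
        return "Computer"
    return "Vermeer"


if __name__ == "__main__":
    print(suggest_engine(Workload(edges=300_000_000, fits_in_worker_memory=True)))
```

A real choice would also weigh the non-numeric items in the lists, such as the need for the Web UI or for Hadoop integration, which this sketch leaves out.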
195217

196218
## Documentation
197219
