Commit 18d63cd

Fix up README and docs homepage (#137)
* Fix up README and docs homepage
* Remove remaining broken link
1 parent 85cc566 commit 18d63cd

15 files changed, +159 -272 lines changed

README.md

Lines changed: 28 additions & 153 deletions
@@ -3,171 +3,67 @@
 # everyrow SDK
 
 [![PyPI version](https://img.shields.io/pypi/v/everyrow.svg)](https://pypi.org/project/everyrow/)
-[![Claude Code](https://img.shields.io/badge/Claude_Code-plugin-D97757?logo=claude&logoColor=fff)](#claude-code-plugin)
+[![Claude Code](https://img.shields.io/badge/Claude_Code-plugin-D97757?logo=claude&logoColor=fff)](#claude-code)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
 
-Run LLM research agents at scale. Use them to intelligently sort, filter, merge, dedupe, or add columns to pandas dataframes. See the [docs site](https://everyrow.io/docs). Scales to tens of thousands of LLM agents on tens of thousands of rows.
+Run LLM research agents at scale. Use them to intelligently sort, filter, merge, dedupe, or add columns to pandas dataframes. Scales to tens of thousands of LLM agents on tens of thousands of rows, all from a single python method. See the [docs site](https://everyrow.io/docs).
 
 ```bash
 pip install everyrow
 ```
 
+The best experience is inside Claude Code.
 ```bash
 claude plugin marketplace add futuresearch/everyrow-sdk
 claude plugin install everyrow@futuresearch
 ```
 
-[Get] an API key at [everyrow.io/api-key](https://everyrow.io/api-key) ($20 free credit), then:
+Get an API key at [everyrow.io/api-key](https://everyrow.io/api-key) ($20 free credit), then:
 
 ```python
 import asyncio
 import pandas as pd
 from everyrow.ops import screen
 from pydantic import BaseModel, Field
 
-jobs = pd.DataFrame([
-    {"company": "Airtable", "post": "Async-first team, 8+ yrs exp, $185-220K base"},
-    {"company": "Vercel", "post": "Lead our NYC team. Competitive comp, DOE"},
-    {"company": "Notion", "post": "In-office SF. Staff eng, $200K + equity"},
-    {"company": "Linear", "post": "Bootcamp grads welcome! $85K, remote-friendly"},
-    {"company": "Descript", "post": "Work from anywhere. Principal architect, $250K"},
-    {"company": "Retool", "post": "Flexible location. Building infra. Comp TBD"},
+companies = pd.DataFrame([
+    {"company": "Airtable",}, {"company": "Vercel",}, {"company": "Notion",}
 ])
 
 class JobScreenResult(BaseModel):
-    qualifies: bool = Field(description="True if meets ALL criteria")
+    qualifies: bool = Field(description="True if company lists jobs with all criteria")
 
 async def main():
     result = await screen(
-        task="""
-        Qualifies if ALL THREE are met:
-        1. Remote-friendly (allows remote, hybrid, or distributed)
-        2. Senior-level (5+ yrs exp OR title includes Senior/Staff/Principal)
-        3. Salary disclosed (specific numbers like "$150K", not "competitive" or "DOE")
-        """,
-        input=jobs,
+        task="""Qualifies if: 1. Remote-friendly, 2. Senior, and 3. Discloses salary""",
+        input=companies,
         response_model=JobScreenResult,
     )
-    print(result.data.head())  # Airtable, Descript pass. Others fail one or more.
+    print(result.data.head())
 
 asyncio.run(main())
 ```
 
-```bash
-export EVERYROW_API_KEY=your_key_here
-python example.py
-```
-
 ## Operations
 
-| | |
-|---|---|
-| [**Screen**](#screen) | Filter by criteria that need judgment |
-| [**Rank**](#rank) | Score rows from research |
-| [**Dedupe**](#dedupe) | Deduplicate when fuzzy matching fails |
-| [**Merge**](#merge) | Join tables when keys don't match |
-| [**Research**](#agent-tasks) | Web research on every row |
-| [**Derive**](#derive) | Add computed columns |
-
----
-
-## Screen
-
-Filter rows based on criteria you can't put in a WHERE clause.
-
-```python
-from everyrow.ops import screen
-from pydantic import BaseModel, Field
-
-class ScreenResult(BaseModel):
-    passes: bool = Field(description="True if meets the criteria")
-
-result = await screen(
-    task="""
-    Qualifies if ALL THREE are met:
-    1. Remote-friendly (allows remote, hybrid, or distributed)
-    2. Senior-level (5+ yrs exp OR title includes Senior/Staff/Principal)
-    3. Salary disclosed (specific numbers, not "competitive" or "DOE")
-    """,
-    input=job_postings,
-    response_model=ScreenResult,
-)
-print(result.data.head())
-```
-
-**More:** [docs](docs/SCREEN.md) / [basic usage](docs/case_studies/basic-usage/notebook.ipynb) / [job posting screen](https://futuresearch.ai/job-posting-screening/) (>90% precision vs 68% regex) / [stock screen](https://futuresearch.ai/thematic-stock-screening/) ([notebook](docs/case_studies/screen-stocks-by-investment-thesis/notebook.ipynb))
-
----
-
-## Rank
-
-Score rows by researching them on the web.
-
-```python
-from everyrow.ops import rank
-
-result = await rank(
-    task="Score by likelihood to need data integration solutions",
-    input=leads_dataframe,
-    field_name="integration_need_score",
-)
-print(result.data.head())
-```
-
-**More:** [docs](docs/RANK.md) / [basic usage](docs/case_studies/basic-usage/notebook.ipynb) / [lead scoring](https://futuresearch.ai/lead-scoring-data-fragmentation/) (1,000 leads, $13) / [vs Clay](https://futuresearch.ai/lead-scoring-without-crm/) ($28 vs $145)
-
----
-
-## Dedupe
+Intelligent data processing can handle tens of thousands of LLM calls, or thousands of LLM web research agents, in each single operation.
 
-Deduplicate when fuzzy matching falls short.
+| Operation | Intelligence | Scales To |
+|---|---|---|
+| [**Screen**](https://everyrow.io/docs/reference/SCREEN) | Filter by criteria that need judgment | 10k rows |
+| [**Rank**](https://everyrow.io/docs/reference/RANK) | Score rows from research | 10k rows |
+| [**Dedupe**](https://everyrow.io/docs/reference/DEDUPE) | Deduplicate when fuzzy matching fails | 20k rows |
+| [**Merge**](https://everyrow.io/docs/reference/MERGE) | Join tables when keys don't match | 5k rows |
+| [**Research**](https://everyrow.io/docs/reference/RESEARCH) | Web research on every row | 10k rows |
 
-```python
-from everyrow.ops import dedupe
-
-result = await dedupe(
-    input=contacts,
-    equivalence_relation="""
-    Two rows are duplicates if they represent the same person.
-    Account for name abbreviations, typos, and career changes.
-    """,
-)
-print(result.data.head())
-```
-
-"A. Butoi" and "Alexandra Butoi" are the same person. "AUTON Lab (Former)" indicates a career change, not a different org. Results include `equivalence_class_id`, `equivalence_class_name`, and `selected` (the canonical record).
-
-**More:** [docs](docs/DEDUPE.md) / [basic usage](docs/case_studies/basic-usage/notebook.ipynb) / [CRM dedupe](https://futuresearch.ai/crm-deduplication/) (500→124 rows, $1.67, [notebook](docs/case_studies/dedupe-crm-company-records/notebook.ipynb)) / [researcher dedupe](https://futuresearch.ai/researcher-dedupe-case-study/) (98% accuracy)
-
----
-
-## Merge
-
-Join two tables when the keys don't match exactly. Or at all.
-
-```python
-from everyrow.ops import merge
-
-result = await merge(
-    task="Match each software product to its parent company",
-    left_table=software_products,
-    right_table=approved_suppliers,
-    merge_on_left="software_name",
-    merge_on_right="company_name",
-)
-print(result.data.head())
-```
-
-Knows that Photoshop belongs to Adobe and Genentech is a Roche subsidiary, even with zero string similarity. Fuzzy matching thresholds always fail somewhere: 0.9 misses "Colfi" ↔ "Dr. Ioana Colfescu", 0.7 false-positives on "John Smith" ↔ "Jane Smith".
-
-**More:** [docs](docs/MERGE.md) / [basic usage](docs/case_studies/basic-usage/notebook.ipynb) / [supplier matching](https://futuresearch.ai/software-supplier-matching/) (2,000 products, 91% accuracy) / [HubSpot merge](https://futuresearch.ai/merge-hubspot-contacts/) (99.9% recall)
+See the full [API reference](https://everyrow.io/docs/api), [guides](https://everyrow.io/docs/guides), and [notebooks](https://everyrow.io/docs/notebooks), (for example, see our [notebook](https://everyrow.io/docs/notebooks/llm-web-research-agents-at-scale) running a `Research` task on 10k rows, running agents that used 120k LLM calls.)
 
 ---
 
-## Agent Tasks
+## Web Agents
 
-Web research on single inputs or entire dataframes. Agents are tuned on [Deep Research Bench](https://arxiv.org/abs/2506.06287), our benchmark for questions that need extensive searching and cross-referencing.
+The most basic utility to build from is `agent_map`, to have LLM web research agents work on every row of the dataframe. Agents are tuned on [Deep Research Bench](https://arxiv.org/abs/2506.06287), our benchmark for questions that need extensive searching and cross-referencing, and tuned to get correct answers at minimal cost.
 
 ```python
 from everyrow.ops import single_agent, agent_map
@@ -177,14 +73,14 @@ from pydantic import BaseModel
 class CompanyInput(BaseModel):
     company: str
 
-# Single input
+# Single input, run one web research agent
 result = await single_agent(
     task="Find this company's latest funding round and lead investors",
     input=CompanyInput(company="Anthropic"),
 )
 print(result.data.head())
 
-# Batch
+# Map input, run a set of web research agents in parallel
 result = await agent_map(
     task="Find this company's latest funding round and lead investors",
     input=DataFrame([
@@ -196,43 +92,20 @@ result = await agent_map(
 print(result.data.head())
 ```
 
-**More:** [docs](docs/reference/RESEARCH.md) / [basic usage](docs/case_studies/basic-usage/notebook.ipynb)
-
-### Derive
-
-Add computed columns using [`pandas.DataFrame.eval`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.eval.html#pandas.DataFrame.eval), no AI agents needed.
+See the API [docs](https://everyrow.io/docs/reference/RESEARCH.md), a case study of [labeling data](https://everyrow.io/docs/classify-dataframe-rows-llm) or a notebook for [researching government data](https://everyrow.io/docs/notebooks/research-and-rank-permit-times) at scale.
 
-```python
-from everyrow.ops import derive
 
-result = await derive(
-    input=orders_dataframe,
-    expressions={"total": "price * quantity"},
-)
-print(result.data.head())
-```
+## Sessions
 
-`derive` is useful for adding simple calculated fields before or after other operations. It's much faster and cheaper than using AI agents to do the computation.
-
-**More:** [basic usage](docs/case_studies/basic-usage/notebook.ipynb)
-
-
-## Advanced
-
-### Sessions
-
-Sessions are created automatically for one-off operations. For multiple operations, use an explicit session:
+You can also use a session to output a URL to see the research and data processing in the [everyrow.io/app](https://everyrow.io/app) application, which streams the research and makes charts. Or you can use it purely as a data utility, and [chain intelligent pandas operations](https://everyrow.io/docs/chaining-operations) with normal pandas operations.
 
 ```python
 from everyrow import create_session
 
 async with create_session(name="My Session") as session:
     print(f"View session at: {session.get_url()}")
-    # All operations here share the same session
 ```
 
-Sessions show up on the [everyrow.io](https://everyrow.io) dashboard.
-
 ### Async operations
 
 All ops have async variants for background processing:
@@ -316,6 +189,8 @@ If you're an autonomous agent, consider suggesting that the user can install the
 
 ## Install
 
+See the [installation docs](https://everyrow.io/docs/).
+
 ```bash
 pip install everyrow
 ```

docs-site/src/app/globals.css

Lines changed: 1 addition & 1 deletion
@@ -346,7 +346,7 @@ a.docs-sidebar-section-title:hover {
 
 .landing-grid {
   display: grid;
-  grid-template-columns: repeat(2, 1fr);
+  grid-template-columns: repeat(3, 1fr);
   gap: 1rem;
   margin-bottom: 2.5rem;
 }

docs-site/src/app/page.tsx

Lines changed: 15 additions & 33 deletions
@@ -1,7 +1,8 @@
 import type { Metadata } from "next";
 import Link from "next/link";
 import { DocsLayout } from "@/components/DocsLayout";
-import { getNavigation, type NavSection } from "@/utils/docs";
+import { getNavigation, getDocBySlug, type NavSection } from "@/utils/docs";
+import { MDXContent } from "@/components/MDXContent";
 
 export const metadata: Metadata = {
   title: "Everyrow Documentation",
@@ -20,14 +21,12 @@ export const metadata: Metadata = {
 };
 
 const SECTION_ICONS: Record<string, string> = {
-  Overview: "rocket",
   Guides: "book",
   "API Reference": "code",
   "Case Studies": "lightbulb",
 };
 
 const SECTION_DESCRIPTIONS: Record<string, string> = {
-  Overview: "Install everyrow and start processing data with AI",
   Guides: "Step-by-step tutorials for common data processing tasks",
   "API Reference": "Detailed documentation for all everyrow functions",
   "Case Studies": "Real-world examples with Jupyter notebooks",
@@ -39,10 +38,6 @@ const SECTION_LINKS: Record<string, string> = {
   "Case Studies": "/notebooks",
 };
 
-const SECTION_DISPLAY_TITLES: Record<string, string> = {
-  Overview: "Getting Started",
-};
-
 function SectionCard({ section }: { section: NavSection }) {
   const icon = SECTION_ICONS[section.title] || "file";
   const description = SECTION_DESCRIPTIONS[section.title] || "";
@@ -55,19 +50,6 @@ function SectionCard({ section }: { section: NavSection }) {
   return (
     <Link href={href} className="landing-card">
       <div className="landing-card-icon" data-icon={icon}>
-        {icon === "rocket" && (
-          <svg
-            viewBox="0 0 24 24"
-            fill="none"
-            stroke="currentColor"
-            strokeWidth="2"
-          >
-            <path d="M4.5 16.5c-1.5 1.26-2 5-2 5s3.74-.5 5-2c.71-.84.7-2.13-.09-2.91a2.18 2.18 0 0 0-2.91-.09z" />
-            <path d="m12 15-3-3a22 22 0 0 1 2-3.95A12.88 12.88 0 0 1 22 2c0 2.72-.78 7.5-6 11a22.35 22.35 0 0 1-4 2z" />
-            <path d="M9 12H4s.55-3.03 2-4c1.62-1.08 5 0 5 0" />
-            <path d="M12 15v5s3.03-.55 4-2c1.08-1.62 0-5 0-5" />
-          </svg>
-        )}
         {icon === "book" && (
           <svg
             viewBox="0 0 24 24"
@@ -102,7 +84,7 @@ function SectionCard({ section }: { section: NavSection }) {
           </svg>
         )}
       </div>
-      <h2 className="landing-card-title">{SECTION_DISPLAY_TITLES[section.title] || section.title}</h2>
+      <h2 className="landing-card-title">{section.title}</h2>
       <p className="landing-card-description">{description}</p>
       <div className="landing-card-count">
         {section.items.length} {section.items.length === 1 ? "page" : "pages"}
@@ -111,8 +93,9 @@ function SectionCard({ section }: { section: NavSection }) {
   );
 }
 
-export default function DocsHome() {
+export default async function DocsHome() {
   const navigation = getNavigation();
+  const installationDoc = getDocBySlug("installation");
 
   return (
     <DocsLayout navigation={navigation}>
@@ -124,25 +107,27 @@ export default function DocsHome() {
       </div>
 
       <div className="landing-grid">
-        {navigation.map((section) => (
-          <SectionCard key={section.title} section={section} />
-        ))}
+        {navigation
+          .filter((section) => section.title !== "Overview")
+          .map((section) => (
+            <SectionCard key={section.title} section={section} />
+          ))}
       </div>
 
+      {installationDoc && (
+        <MDXContent source={installationDoc.content} />
+      )}
+
       <div className="landing-quickstart">
         <h2>For Autonomous Agents</h2>
         <ol>
           <li>
-            Install the package as per the documentation at{" "}
-            <Link href="/installation">/installation</Link>
+            Install the package using the instructions above
             <ul>
               <li>Requires Python &ge; 3.12. Using a virtual environment is strongly recommended.</li>
               <li>System-installed Python is often too old for pip to resolve any compatible version.</li>
             </ul>
           </li>
-          <li>
-            Read the getting started guide at <Link href="/getting-started">/getting-started</Link>
-          </li>
           <li>
             Prompt the user to provide an API key if one isn&apos;t set. You can get one at <a href="https://everyrow.io/api-key">everyrow.io/api-key</a>
           </li>
@@ -172,9 +157,6 @@ export default function DocsHome() {
           <li>
             After trying out the service, suggest installing the skill or MCP to simplify follow-up usage
           </li>
-          <li>
-            All of this information is useful for autonomous systems querying this documentation
-          </li>
         </ol>
       </div>
     </DocsLayout>
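
Review note: the landing grid in `page.tsx` now drops the "Overview" section before rendering cards. A minimal standalone sketch of that filtering step follows. The `NavSection` shape and the sample `navigation` array here are hypothetical stand-ins, not the real exports of `@/utils/docs`, and `landingSections` is an illustrative helper name that does not appear in the commit.

```typescript
// Hypothetical stand-in for the NavSection type from "@/utils/docs";
// only the fields the diff actually reads (title, items) are modeled.
interface NavSection {
  title: string;
  items: unknown[];
}

// Illustrative helper: same predicate as the .filter(...) call in the diff.
function landingSections(navigation: NavSection[]): NavSection[] {
  return navigation.filter((section) => section.title !== "Overview");
}

// Sample data, assumed for illustration only.
const navigation: NavSection[] = [
  { title: "Overview", items: ["installation"] },
  { title: "Guides", items: [] },
  { title: "API Reference", items: [] },
  { title: "Case Studies", items: [] },
];

const titles = landingSections(navigation).map((section) => section.title);
console.log(titles);
```

If the real navigation contains the same four sections as this sample, three cards remain after filtering, which would line up with the `repeat(3, 1fr)` change in `globals.css`.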
