Commit 1aa6465

ericmigi and claude committed
Add README with field report from reverse-engineering work
Documents what we learned about Apple Notes' CloudKit + topotext CRDT proto, what works, what we tried that failed (so the next person doesn't repeat them), and the v4 UUID gotcha that took hours to find. Read order, debug recipes, code tour, and known limitations included. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent dc650f5 commit 1aa6465

1 file changed

Lines changed: 343 additions & 0 deletions

README.md

@@ -0,0 +1,343 @@
# Notes of Fruit

An Android client for iCloud Notes. Reverse-engineered from CloudKit's private DB
API + Apple's `topotext` CRDT proto. Reads and writes notes that round-trip with
Mac's Apple Notes and iCloud.com.

This is a research codebase, not a polished product. Treat the README as a field
report for the next person (or agent) who picks it up.

## Status

| Capability | Status |
|---|---|
| Sign in with Apple ID (cookie-based) | Works |
| List notes (recents, paginated) | Works |
| Folder navigation (sidebar, trash filter) | Works |
| Read note content (decode topotext proto) | Works for `NOTE_STORE_PROTO`; not the modern `MergableData` shape |
| Append text | Works (round-trips with Mac) |
| Mid-text splice (insert/replace/delete) | Works (matches iCloud.com's slot-promotion pattern) |
| Create new note (FAB → save) | Works **only with v4 UUIDs**; see "The most important thing you'll learn here" below |
| Auto-save on lifecycle pause | Works |
| Share OUT (Android intent chooser) | Works (plaintext) |
| Delete | Works (`forceDelete` removes the record entirely) |
| Format display (headings, lists, checkboxes) | Decoded but not rendered yet (Phase D1 done; D2 pending) |
| Format input toolbar | Not built |
| Image attachments | Renders a `U+FFFC` placeholder stub; no image display |
| Conflict UX | Auto-retry on CONFLICT/oplock; no concurrent-edit merge yet |

## The most important thing you'll learn here

**Apple's replica UUID must be RFC 4122 v4.** This is not documented anywhere.
We spent hours diffing protos before noticing.

When you create a note from a foreign client, Apple Notes on Mac runs notesync,
adds housekeeping fields, and **silently sets `Deleted=1`** if the sole replica's
UUID isn't v4 (byte 6 high nibble = 0x4, byte 8 high two bits = 0b10). The body
proto is left intact. The note appears in Recently Deleted within a few minutes.

Filling 16 bytes with `SecureRandom().nextBytes(...)` produces raw random bytes:
there is only a ~1/16 chance the version nibble is 4 (and a 1/4 chance the variant
bits are `0b10`). Use `java.util.UUID.randomUUID()`, or set the bits explicitly:

```kotlin
import java.security.SecureRandom

val bytes = ByteArray(16).also { SecureRandom().nextBytes(it) }
bytes[6] = ((bytes[6].toInt() and 0x0F) or 0x40).toByte() // version nibble = 4
bytes[8] = ((bytes[8].toInt() and 0x3F) or 0x80).toByte() // variant bits = 0b10
```

See [DeviceIdentity.kt](app/src/main/java/com/example/applenotes/auth/DeviceIdentity.kt)
and commit `3ee1a60`.
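
The bit layout described above can be checked before a UUID is ever sent. The following is a minimal sketch, not the code in `DeviceIdentity.kt`; the names `isRfc4122V4` and `newReplicaUuidBytes` are hypothetical helpers introduced here for illustration.

```kotlin
import java.nio.ByteBuffer
import java.util.UUID

/** Returns true if the 16 bytes carry the RFC 4122 version-4 and variant bits. */
fun isRfc4122V4(bytes: ByteArray): Boolean {
    if (bytes.size != 16) return false
    val version = (bytes[6].toInt() and 0xF0) shr 4 // high nibble of byte 6
    val variant = (bytes[8].toInt() and 0xC0) shr 6 // top two bits of byte 8
    return version == 4 && variant == 0b10
}

/** Hypothetical helper: a fresh v4 UUID as 16 big-endian bytes. */
fun newReplicaUuidBytes(): ByteArray {
    val uuid = UUID.randomUUID() // always version 4, RFC 4122 variant
    return ByteBuffer.allocate(16)
        .putLong(uuid.mostSignificantBits)
        .putLong(uuid.leastSignificantBits)
        .array()
}
```

Asserting `isRfc4122V4(...)` just before building a create request would turn the silent trashing into an immediate local failure.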

## How Apple Notes works (the parts that matter)

### CloudKit layer

- Endpoint: `https://p<N>-ckdatabasews.icloud.com.cn` for Chinese accounts, `.com`
  otherwise. Path: `/database/1/com.apple.notes/production/private/...`.
- Auth: cookie-based (`X-APPLE-WEBAUTH-TOKEN`). We harvest cookies via Android's
  `WebView` flow in [`auth/`](app/src/main/java/com/example/applenotes/auth/).
- Zone: `Notes`. Record types: `Note`, `Folder`, `SearchIndexes`.
- Required fields on `Note` create: `TextDataEncrypted`, `TitleEncrypted`,
  `SnippetEncrypted`, `Folder` (CKReference), `CreationDate`, `ModificationDate`.
- **Date trap**: omit `ModificationDate` and CloudKit defaults it to a year-1
  sentinel (`-62135769600000` ms). That value appears to match Foundation's
  `Date.distantPast`; .NET's `DateTime.MinValue` would be `-62135596800000`.
  Mac silently skips records with that sentinel. Always send
  `System.currentTimeMillis()`.
- `recordChangeTag` is CloudKit's optimistic-concurrency token. Mac's notesync
  bumps it asynchronously after merging, so expect `CONFLICT`/`oplock` errors on
  edits that follow a create. We auto-retry once with a fresh tag.
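
The retry-once behaviour in the last bullet can be sketched as a small generic helper. This is an illustration under assumptions, not the code in `AppleNotesClient`: `ConflictException`, `fetchChangeTag`, and `submit` are hypothetical names standing in for the real CONFLICT/oplock handling, a fresh lookup, and the modify request.

```kotlin
/** Hypothetical error type standing in for a CloudKit CONFLICT/oplock response. */
class ConflictException(message: String) : Exception(message)

/**
 * Submits a modify with a freshly fetched recordChangeTag and, if the server
 * reports a conflict, refetches the tag and retries exactly once.
 * A second conflict propagates to the caller.
 */
fun <T> modifyWithOneRetry(
    fetchChangeTag: () -> String,
    submit: (changeTag: String) -> T,
): T {
    return try {
        submit(fetchChangeTag())
    } catch (e: ConflictException) {
        // Mac's notesync likely bumped the tag after our create; try again with a fresh one.
        submit(fetchChangeTag())
    }
}
```

Retrying only once keeps the post-create race covered without masking a genuine concurrent edit behind an endless loop.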

### Listing notes and folders

- `recordType=Note` query works.
- `recordType=Folder` query is **rejected** by the server: "Type is not marked
  indexable: Folder". Use `/records/lookup` with recordNames pulled from notes'
  `Folder` references instead.
- The `SearchIndexes` "recents" index exists and works. "folders" returns "No
  index of this name exists".
- Special folder recordNames: `DefaultFolder-CloudKit` ("Notes"),
  `TrashFolder-CloudKit` ("Recently Deleted").
- Notes in trash live under `Folder = TrashFolder-CloudKit`. The `Deleted` field
  is added by Mac housekeeping; don't filter by it. Filter by Folder ref.
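
The folder rules above amount to two small operations on a list of fetched notes. The sketch below assumes a simplified in-memory model; `NoteSummary`, `visibleNotes`, and `folderRecordNamesToLookUp` are hypothetical names, not types from this codebase.

```kotlin
// recordName of the trash folder, as listed above
val TRASH_FOLDER = "TrashFolder-CloudKit"

/** Hypothetical, simplified view of a fetched Note record. */
data class NoteSummary(
    val title: String,
    val folderRecordName: String,
    val deletedField: Int? = null, // present on some alive notes, so never used as a filter
)

/** Hides trashed notes by Folder reference only; the Deleted field is ignored on purpose. */
fun visibleNotes(notes: List<NoteSummary>, includeTrash: Boolean = false): List<NoteSummary> =
    if (includeTrash) notes else notes.filter { it.folderRecordName != TRASH_FOLDER }

/** Collects the distinct folder recordNames to pass to /records/lookup. */
fun folderRecordNamesToLookUp(notes: List<NoteSummary>): Set<String> =
    notes.map { it.folderRecordName }.toSet()
```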
81+
82+
### The body proto: `topotext.String`
83+
84+
`TextDataEncrypted` is gzipped (or zlibbed) protobuf. The shape:
85+
86+
```
87+
versioned_document.Document {
88+
Version version = 2 {
89+
int32 minimumSupportedVersion = 2;
90+
bytes data = 3; // encoded topotext.String
91+
}
92+
}
93+
94+
topotext.String {
95+
string string = 2; // the visible text, concatenation of live substrings
96+
repeated Substring substrings = 3;
97+
VectorTimestamp timestamp = 4; // per-replica clocks
98+
repeated AttributeRun attributeRuns = 5;
99+
}
100+
101+
Substring {
102+
CharID charID = 1; // (replicaID, clock) — globally unique
103+
uint32 length = 2;
104+
CharID timestamp = 3; // for tombstone fresh-ts ordering
105+
bool tombstone = 4;
106+
repeated int32 child = 5; // forward links into substrings[]
107+
}
108+
109+
CharID {
110+
uint32 replicaID = 1; // 1-based index into VectorTimestamp.clock
111+
uint32 clock = 2; // Lamport
112+
}
113+
```
114+
115+
Invariants (verified by sampling iCloud.com-created and Mac-created notes):
116+
117+
- The substring array always starts with **doc-start** (`charID=(0,0)`,
118+
`length=0`) and ends with the **sentinel** (`charID=(0,0xFFFFFFFF)`,
119+
`length=0`). Don't reorder.
120+
- Children are forward links into the array. The walk produces the visible-order
121+
chain: doc-start → ... → sentinel.
122+
- New inserts go between doc-start and sentinel; tombstones flip `tombstone=true`
123+
but stay in the tree.
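
The invariants above lend themselves to a cheap sanity check before any write. The following is a rough sketch over a hypothetical decoded model (`CharId`, `Substring`, `checkInvariants` are illustrative names). It makes two assumptions the README does not confirm: that "forward link" means a child index greater than the substring's own index, and that `length` counts UTF-16 code units.

```kotlin
data class CharId(val replicaId: Long, val clock: Long)

data class Substring(
    val charId: CharId,
    val length: Int,
    val tombstone: Boolean = false,
    val children: List<Int> = emptyList(),
)

val DOC_START = CharId(0, 0)
val SENTINEL = CharId(0, 0xFFFFFFFFL)

/** Returns a list of violated invariants; an empty list means the tree looks sane. */
fun checkInvariants(visibleText: String, substrings: List<Substring>): List<String> {
    val problems = mutableListOf<String>()
    val first = substrings.firstOrNull()
    val last = substrings.lastOrNull()
    if (first == null || first.charId != DOC_START || first.length != 0) problems += "missing doc-start"
    if (last == null || last.charId != SENTINEL || last.length != 0) problems += "missing sentinel"
    substrings.forEachIndexed { i, s ->
        for (c in s.children) {
            // Assumption: a forward link points at a later index inside the array.
            if (c <= i || c >= substrings.size) problems += "substring $i has non-forward child $c"
        }
    }
    // Assumption: live (non-tombstoned) lengths add up to the visible string's length.
    val liveLength = substrings.filterNot { it.tombstone }.sumOf { it.length }
    if (liveLength != visibleText.length) problems += "live length $liveLength != text length ${visibleText.length}"
    return problems
}
```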

### CRDT semantics in practice

- Each character ever inserted has a globally-unique `(replicaID, clock)` ID.
  `replicaID=1` is whoever wrote *this* version of the proto; the same UUID may
  be at a different slot in someone else's proto.
- **Slot rotation** matters: when iCloud.com edits a note, it always remaps
  itself to slot 1, demoting everyone else. Mac's notesync uses the rotation as
  a "refresh local cache" signal. We mirror this in
  [`NoteAppender.setBodySpliceBytes`](app/src/main/java/com/example/applenotes/proto/NoteAppender.kt).
- `ReplicaIDToNotesVersionDataEncrypted` is the per-replica version vector. Mac
  uses changes to this field as the cache-invalidation signal. Pass it through
  unchanged on modify (don't strip or rewrite). If it is stripped, Mac duplicates
  the body on multi-substring notes after Android edits.
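
Slot rotation can be pictured as a reorder of the replica list plus a remap of every `replicaID` in the substring tree. The sketch below is an illustration only, not what `NoteAppender.setBodySpliceBytes` does internally; `Rotation` and `promoteToSlotOne` are hypothetical names, and it assumes the demoted replicas keep their relative order, which the README does not state.

```kotlin
/** New slot order (index 0 is slot 1) plus the old-slot to new-slot mapping for charID rewrites. */
data class Rotation(val replicas: List<String>, val slotMap: Map<Int, Int>)

/**
 * Moves `self` to slot 1 and shifts everyone else down.
 * Assumption: demoted replicas keep their relative order.
 */
fun promoteToSlotOne(replicas: List<String>, self: String): Rotation {
    val reordered = listOf(self) + replicas.filter { it != self }
    // Slots are 1-based, matching CharID.replicaID.
    val slotMap = replicas
        .mapIndexed { i, uuid -> (i + 1) to (reordered.indexOf(uuid) + 1) }
        .toMap()
    return Rotation(reordered, slotMap)
}
```

Every `CharID.replicaID` in the existing substrings would then be rewritten through `slotMap` before new substrings are added under slot 1.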

### Compression

Both gzip (`1f 8b`) and zlib (`78 9c`) are accepted. iCloud.com sends zlib for
short notes; Mac sends gzip with OS=0x13. We send gzip with OS=0xff (Java
default). All three round-trip cleanly. Compression format is **not** the cause
of Mac trashing; we ruled this out by sampling.
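
Telling the two containers apart only needs the first two bytes. This is a minimal sketch using `java.util.zip`, not the contents of `Gzip.kt`; `Compression`, `detectCompression`, and `decompress` are names chosen here for illustration. The zlib test uses the header rule from RFC 1950 (deflate method in the low nibble, 16-bit header divisible by 31) rather than matching `78 9c` literally, since other compression levels produce a different second byte.

```kotlin
import java.io.ByteArrayInputStream
import java.util.zip.GZIPInputStream
import java.util.zip.InflaterInputStream

enum class Compression { GZIP, ZLIB, UNKNOWN }

/** Sniffs the container format from the first two bytes. */
fun detectCompression(data: ByteArray): Compression {
    if (data.size < 2) return Compression.UNKNOWN
    val b0 = data[0].toInt() and 0xFF
    val b1 = data[1].toInt() and 0xFF
    return when {
        b0 == 0x1F && b1 == 0x8B -> Compression.GZIP
        // zlib: CM = 8 (deflate) and the big-endian header is a multiple of 31
        (b0 and 0x0F) == 8 && ((b0 shl 8) or b1) % 31 == 0 -> Compression.ZLIB
        else -> Compression.UNKNOWN
    }
}

/** Decompresses either container; throws if the bytes are neither gzip nor zlib. */
fun decompress(data: ByteArray): ByteArray = when (detectCompression(data)) {
    Compression.GZIP -> GZIPInputStream(ByteArrayInputStream(data)).use { it.readBytes() }
    Compression.ZLIB -> InflaterInputStream(ByteArrayInputStream(data)).use { it.readBytes() }
    Compression.UNKNOWN -> throw IllegalArgumentException("not gzip or zlib")
}
```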

### Special characters

- `U+2028` (LINE SEPARATOR) and `U+2029` (PARAGRAPH SEPARATOR) appear in Mac
  notes for soft line breaks inside lists/checkboxes. Don't strip them.
- `U+FFFC` (OBJECT REPLACEMENT CHARACTER) is Apple's placeholder for inline
  attachments. We pass it through but don't render the attachment.

### Paragraph styles (in `AttributeRun.f2.f3`)

| Code | Style |
|---|---|
| 0 | Body |
| 1 | Title |
| 2 | Heading |
| 3 | Subheading |
| 4 | Monospaced |
| 100 | Bulleted list |
| 101 | Checkbox (with `f4=1` if checked) |
| 102 | Numbered list |

`f9` inside `f2` is a 16-byte paragraph UUID (Mac uses these; iCloud.com doesn't,
yet both round-trip). Decode via
[`NoteBodyEditor.parseAttributeRuns`](app/src/main/java/com/example/applenotes/proto/NoteBodyEditor.kt).
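
The table above maps directly onto an enum. The sketch below is illustrative and is not the representation used by `NoteBodyEditor`; the name `ParagraphStyle` and the choice to return `null` for unknown codes are assumptions.

```kotlin
/** Paragraph style codes as listed in the table above. */
enum class ParagraphStyle(val code: Int) {
    BODY(0), TITLE(1), HEADING(2), SUBHEADING(3), MONOSPACED(4),
    BULLETED_LIST(100), CHECKBOX(101), NUMBERED_LIST(102);

    companion object {
        private val byCode = values().associateBy { it.code }

        /** Returns null for codes not in the table so callers can fall back to BODY explicitly. */
        fun fromCode(code: Int): ParagraphStyle? = byCode[code]
    }
}
```

Returning `null` instead of silently defaulting makes unknown codes visible in logs, which helps when a new style shows up in sampled notes.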

## What we tried that failed

These were dead ends; documenting so you don't repeat them.

- **Markdown rendering as a stopgap.** Tempting because it's easy, but Apple's
  notes don't have markdown syntax in their body bytes; they have
  `attributeRuns` metadata. Rendering markdown on the Android side would mean
  literal `**bold**` showing up everywhere on Mac. Use `attributeRuns`, even with
  partial coverage.
- **Empty `createNote(title="", body="")` then `modifyNoteBody`.** This races
  Mac's notesync housekeeping and produces records that get trashed within
  minutes. Even with the v4 UUID fix this is risky. Defer createNote until the
  first save with content (we do; see commit `55704ae`).
- **Querying for folders by `recordType=Folder`.** Server rejects with "Type is
  not marked indexable: Folder". Use `/records/lookup` with recordNames pulled
  from notes' Folder refs.
- **Trusting CloudKit's `Deleted` field as the trash discriminator.** Mac
  housekeeping adds it, but it's also present on alive notes. Filter by Folder
  ref pointing at `TrashFolder-CloudKit` instead.
- **`fetchRecents(1)`.** The server returns 0 results for tiny limits even when
  notes exist. Use ≥ 50.
- **Driving the FAB via `adb shell input tap` coordinates.** Brittle: emulator
  scaling means coords drift between sessions. Use `uiautomator dump` if you
  must, or just exercise the same code paths via `DebugReceiver` broadcasts.
- **Recreating the device's replica UUID per session.** Apple has a per-note
  replica cap and your registry will explode. Persist once, reuse forever.
- **Skipping `ReplicaIDToNotesVersionDataEncrypted` on modify.** Mac duplicates
  the body locally after multi-substring edits. Always pass it through.
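
"Persist once, reuse forever" for the replica UUID can be sketched with a plain file. This is an assumption-laden illustration, not the storage used by `DeviceIdentity.kt` (an Android app would more likely use app-private storage or preferences); `loadOrCreateReplicaUuid` is a hypothetical name.

```kotlin
import java.io.File
import java.util.UUID

/**
 * Returns the replica UUID stored in `store`, creating and saving a v4 UUID
 * on first use so that every later session reuses the same replica identity.
 */
fun loadOrCreateReplicaUuid(store: File): UUID {
    if (store.exists()) {
        return UUID.fromString(store.readText().trim())
    }
    val fresh = UUID.randomUUID() // version 4 by construction
    store.writeText(fresh.toString())
    return fresh
}
```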

## What works (and how to verify)

| What | Test |
|---|---|
| Read existing Mac notes | Open any note in the app; body should match Mac |
| Append edit round-trips | Add text on Android → save → check Mac after relaunch |
| Mid-text splice round-trips | Edit existing text on Android → save → verify on Mac |
| Create new note with v4 UUID | Tap FAB → type → back → verify note appears (and stays) on Mac |
| Auto-save | Type, swipe back without explicit save → reopen → text persists |
| Share OUT | Detail screen → share button → Android chooser appears |
| Folder filter | Drawer → select folder → list filters; trash hidden by default |

The `DebugReceiver` exposes the same code paths via `adb` broadcasts:

```bash
# List all notes
adb shell am broadcast -p com.example.applenotes -a com.example.applenotes.LIST

# Look up a specific note by title
adb shell am broadcast -p com.example.applenotes -a com.example.applenotes.LOOKUP_BY_TITLE --es title 'foo'

# Append to an existing note
adb shell am broadcast -p com.example.applenotes -a com.example.applenotes.APPEND_BY_TITLE \
  --es title 'foo' --es text 'extra line'

# Create a new note
adb shell am broadcast -p com.example.applenotes -a com.example.applenotes.CREATE \
  --es title 'newnote' --es body 'hello'
```

Watch with `adb logcat -s AppleNotesClient:* AppleNotesDebug:*`.

## Code tour

```
app/src/main/java/com/example/applenotes/
├── auth/                   WebView cookie harvest, session refresh, DeviceIdentity (v4 UUID!)
├── client/                 AppleNotesClient: HTTP layer, fetchRecents/lookupNote/createNote/modifyNoteBody
├── proto/
│   ├── ProtobufWire.kt     Hand-rolled protobuf wire format (no schema dep)
│   ├── Gzip.kt             gzip + zlib detect/compress/decompress
│   ├── NoteBodyEditor.kt   Decode topotext.String → visible text + attribute runs
│   ├── NoteAppender.kt     Append / splice / setBody; produces new proto bytes
│   └── NoteCreator.kt      Build proto for a fresh note
├── debug/DebugReceiver.kt  Broadcast-driven CLI for testing without UI
└── ui/AppleNotesApp.kt     Compose UI: list, detail, FAB draft state, drawer, save flow
```

Read order if you're new:

1. `NoteBodyEditor.kt`: get a feel for what a substring tree looks like.
2. `AppleNotesClient.kt`: see the wire shape of every CloudKit op.
3. `NoteAppender.setBodySpliceBytes`: the slot-promotion pattern in detail.
4. `AppleNotesApp.kt`: `ScreenState.Detail` and the deferred-create FAB flow.

## Build / run

```bash
./gradlew :app:assembleDebug
adb install -r app/build/outputs/apk/debug/app-debug.apk
adb shell am start -n com.example.applenotes/.MainActivity
```

You'll need an Android emulator or device with Google Play Services. Sign in
with an Apple ID that has Notes enabled in iCloud. The first launch goes through
the WebView auth flow.

## Advice for the next person/agent

**Use iCloud.com web for A/B comparisons.** When something doesn't work, the
fastest way to figure out what Apple expects is to make iCloud.com produce the
same operation on a parallel test note, then diff the bytes. We caught the v4
UUID issue this way after exhausting other theories.

**Sample multiple healthy notes before drawing conclusions.** Bisha and Money
maker (Mac-created, alive) gave us a baseline. ic-create-test-001
(iCloud.com-created, alive) gave us another. Newtest1-4 and Made-in-android
(Android-created, trashed) showed the failure mode. The trashed ones had
attributes the alive ones never had, but only the UUID was a *consistent*
discriminator across all of them.

**Don't trust a single test result.** Mac trashes records asynchronously: on
notesync, on quit/reopen, sometimes minutes after creation. A note that "looks
fine right after save" may still get trashed an hour later when Mac next runs
notesync. Wait, then re-check.

**The CRDT is real.** Char IDs are stable across remote edits. Insertions slot
in by parent char ID, not position. Tombstones are commutative. If you're
writing new code that touches the proto, encode operations as `(parentCharID,
chars)` or `(charIDs to tombstone)`, not as positional diffs. This makes the
v2 conflict-resolution work (re-splice on stale base) trivial; see the
"What this means for our code" discussion in commit history.

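
The two operation shapes named in "The CRDT is real" can be written down as a small sealed hierarchy. This is a sketch of one possible encoding, not the design from the commit history; `NoteOp`, `InsertAfter`, `Tombstone`, and `summarize` are hypothetical names.

```kotlin
data class CharId(val replicaId: Long, val clock: Long)

/** An edit keyed by char IDs instead of positions, so it can be replayed on a fresh base. */
sealed interface NoteOp

/** Insert `chars` immediately after the character identified by `parent`. */
data class InsertAfter(val parent: CharId, val chars: String) : NoteOp

/** Mark the listed characters as deleted; they stay in the tree as tombstones. */
data class Tombstone(val targets: List<CharId>) : NoteOp

/** One-line description, in the spirit of the wire-shape logs the README recommends. */
fun summarize(op: NoteOp): String = when (op) {
    is InsertAfter -> "insert ${op.chars.length} chars after (${op.parent.replicaId},${op.parent.clock})"
    is Tombstone -> "tombstone ${op.targets.size} chars"
}
```

Because neither shape mentions an offset, the same list of ops stays meaningful after another replica has inserted or deleted text elsewhere in the note.
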
**Match iCloud.com's payload shape, not just the field names.** When in doubt,
look at what iCloud.com sends. We removed `PaperStyleType` / `AttachmentViewType`
/ `Deleted` from `createNote` because iCloud.com doesn't send them; server
defaults handle it. Sending zeros made the new record look "edited" and racy.

**Pre-flight checks before edits.** Always re-`lookupNote` to get a fresh
`recordChangeTag` before submitting a modify, OR rely on the auto-retry on
CONFLICT (we do the latter). The post-create race is real and consistent.

**Write logs that show wire shape.** `summarizeBase64` in `NoteBodyEditor` dumps
ops/replicas/attr_runs in one line. That's how we caught most issues. When you
add a new field, add a corresponding summary line.

**Don't speculate about validation rules in production.** Apple's clients run
validation we can't see. The only authoritative way to know if a write is
accepted is to watch what Mac does to it over time. Keep the iteration loop
short (`adb shell am broadcast` → read response → `lookupNote` again) and you'll
find issues fast.

**Read `PLAN.md` before adding new features.** It records decisions that are
not obvious from the code (markdown vs `attributeRuns`, lifecycle-only auto-save,
empty-body new-note flow). Don't relitigate without reason.

## Known limitations

- **Concurrent-edit conflicts**: the auto-retry resubmits the original body
  bytes, which is last-writer-wins on a true concurrent edit. Safe for the
  post-create housekeeping race we hit in practice. The proper v2 fix is to
  decompose edits into char-ID-keyed ops and replay against a fresh base on
  conflict; the design is sketched in commit `0462bcb`'s discussion thread but
  not yet implemented.
- **Modern `MergableData` proto is not supported.** Newer Apple Notes (iOS 17+
  collaborative notes?) use a different shape. We refuse to operate on those;
  see `NoteBodyEditor.probe`. Read-only support would be a starting point.
- **Image attachments display as the `U+FFFC` placeholder.** Decoding the
  `Attachment` CKRecord type is not implemented.
- **No formatting toolbar yet.** We decode paragraph styles but the editor only
  shows plain text. UI work pending (Phase D2 in PLAN.md).
- **No subscription / push.** Polls when the user opens the app. Background sync,
  CloudKit subscriptions, and a send queue are all unimplemented.

## Disclaimer

This project is unaffiliated with Apple. It uses Apple's public CloudKit Web
Services API with the user's own credentials. It does not break encryption; it
reads notes the authenticated user already has access to. The proto formats
were inferred from observation of the user's own data round-tripping with their
own iCloud account; no Apple-internal documentation was consulted.

If you're going to publish anything based on this, be respectful. Don't redistribute
captured proto data from other users' accounts. Don't ship anything that would
encourage account credential sharing.
