Commit 061268c
feat(channel): inbound rich Parts across IMs (#644)
* feat(pipeline): consume inbound MessagePart in adapt
Adapters that populate Message.Parts (currently only Telegram, for mentions)
had their structured data silently dropped: adaptMessage and adaptEdit only
read msg.Message.Text, so the canonical ContentNode tree the renderer is
designed to consume was never built.
Add adaptBody/adaptParts/partToNode that translate channel.MessagePart into
the ContentNode shape the renderer already supports (mention/link/pre/code
plus nested bold/italic/strikethrough wrappers). Fall back to the existing
plain-text path when Parts is empty.
Telegram inbound mentions now reach the LLM as <mention uid="..."> instead
of being flattened to plain text.
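
The translation described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the MessagePart and ContentNode field names, the string part types, and the attribute keys ("uid", "href") are all assumptions made for the example; only the function names adaptBody and partToNode come from the commit message.

```go
package main

// MessagePart is a hypothetical stand-in for channel.MessagePart.
type MessagePart struct {
	Type   string // assumed values: "text", "mention", "link", "code", "pre"
	Text   string
	URL    string
	UserID string
	Styles []string // assumed values: "bold", "italic", "strikethrough"
}

// ContentNode is a hypothetical stand-in for the renderer's node type.
type ContentNode struct {
	Tag      string // empty for a plain-text leaf
	Text     string
	Attrs    map[string]string
	Children []ContentNode
}

// partToNode builds the leaf for one part, then wraps it in style nodes.
func partToNode(p MessagePart) ContentNode {
	var node ContentNode
	switch p.Type {
	case "mention":
		node = ContentNode{Tag: "mention", Text: p.Text, Attrs: map[string]string{"uid": p.UserID}}
	case "link":
		node = ContentNode{Tag: "link", Text: p.Text, Attrs: map[string]string{"href": p.URL}}
	case "code", "pre":
		node = ContentNode{Tag: p.Type, Text: p.Text}
	default:
		node = ContentNode{Text: p.Text}
	}
	// Wrap from the last style inward so Styles[0] ends up outermost.
	for i := len(p.Styles) - 1; i >= 0; i-- {
		node = ContentNode{Tag: p.Styles[i], Children: []ContentNode{node}}
	}
	return node
}

// adaptBody prefers Parts and falls back to plain text when Parts is empty.
func adaptBody(text string, parts []MessagePart) []ContentNode {
	if len(parts) == 0 {
		if text == "" {
			return nil
		}
		return []ContentNode{{Text: text}}
	}
	nodes := make([]ContentNode, 0, len(parts))
	for _, p := range parts {
		if p.Text == "" {
			continue // drop literally empty parts only
		}
		nodes = append(nodes, partToNode(p))
	}
	return nodes
}
```

With no Parts, adaptBody returns a single plain-text leaf, which corresponds to the pre-existing behaviour the commit keeps as the fallback.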
* feat(telegram): extract bold/italic/code/link entities as inbound parts
The previous extractor only recognised mention/text_mention, so Telegram
formatting (bold, italic, code, code blocks, text_link, bare URLs) was
flattened to plain text before reaching the pipeline. With the renderer
now consuming MessagePart via adaptBody, populating the full entity set
lets the LLM see the user's formatting intent.
The new extractor walks rune offsets, sorts entities by (offset asc,
length desc) so an outer span wins on overlap, emits plain-text Parts
for the gaps between entities, and returns nil when the result carries
no rich content so callers fall back to the Text field unchanged.
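
A sketch of the ordering and gap-filling rule, working on offsets only so it stays independent of how the text is indexed. The Entity and Span types and the planSpans name are hypothetical; the real extractor emits MessagePart values rather than spans.

```go
package main

import "sort"

// Entity mirrors the offset/length shape of a Telegram entity (hypothetical type).
type Entity struct {
	Type   string
	Offset int
	Length int
}

// Span is a half-open range [Start, End); Type "text" marks a plain gap.
type Span struct {
	Type  string
	Start int
	End   int
}

// planSpans orders entities by (offset asc, length desc), keeps the outer span
// on overlap, fills gaps with plain text, and returns nil when nothing rich
// survives so the caller can fall back to the Text field.
func planSpans(total int, entities []Entity) []Span {
	sorted := append([]Entity(nil), entities...)
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].Offset != sorted[j].Offset {
			return sorted[i].Offset < sorted[j].Offset
		}
		return sorted[i].Length > sorted[j].Length
	})

	var spans []Span
	cursor, rich := 0, false
	for _, e := range sorted {
		end := e.Offset + e.Length
		// Skip empty, out-of-range, and overlapping (already covered) entities.
		if e.Length <= 0 || e.Offset < cursor || end > total {
			continue
		}
		if e.Offset > cursor {
			spans = append(spans, Span{"text", cursor, e.Offset})
		}
		spans = append(spans, Span{e.Type, e.Offset, end})
		cursor, rich = end, true
	}
	if !rich {
		return nil
	}
	if cursor < total {
		spans = append(spans, Span{"text", cursor, total})
	}
	return spans
}
```

Sorting longer entities first at the same offset is what lets a simple cursor guard pick the outer span: once it is emitted, any shorter entity starting at the same offset falls behind the cursor and is skipped.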
* feat(feishu): emit inbound post structure as MessagePart
Feishu post messages already carry a structured tree (lines of typed
elements: text with style, link, at-mention, code block). The inbound
path flattened it to a space-joined string, so style, link URLs, and
@user IDs never reached the pipeline.
extractFeishuPostParts walks the same lines/parts structure as the
existing text and attachment extractors, translates each tag (text +
style array, a, at, code_block) into channel.MessagePart, and inserts
a newline text part between lines so the LLM sees the paragraph break.
Returns nil for single-line all-unstyled posts so callers fall back to
the plain Text field unchanged.
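
The walk over lines and elements might look roughly like this. The Part type, the postParts name, and the element keys ("tag", "text", "style", "href", "user_id") are assumptions for illustration; the real Feishu payload and channel.MessagePart may differ. The default branch forwards text from unknown tags as plain parts, and the nil return follows the single-line all-unstyled rule as read from the description above.

```go
package main

// Part is a hypothetical stand-in for channel.MessagePart.
type Part struct {
	Type   string // assumed values: "text", "link", "mention", "code_block"
	Text   string
	URL    string
	UserID string
	Styles []string
}

// stringSlice accepts both []string and JSON-decoded []any.
func stringSlice(v any) []string {
	var out []string
	switch s := v.(type) {
	case []string:
		out = append(out, s...)
	case []any:
		for _, item := range s {
			if str, ok := item.(string); ok {
				out = append(out, str)
			}
		}
	}
	return out
}

// postParts translates a post (lines of tagged elements) into flat parts,
// inserting a "\n" text part between lines.
func postParts(lines [][]map[string]any) []Part {
	var parts []Part
	rich := false
	for i, line := range lines {
		if i > 0 {
			parts = append(parts, Part{Type: "text", Text: "\n"})
		}
		for _, el := range line {
			tag, _ := el["tag"].(string)
			text, _ := el["text"].(string)
			switch tag {
			case "text":
				styles := stringSlice(el["style"])
				if len(styles) > 0 {
					rich = true
				}
				parts = append(parts, Part{Type: "text", Text: text, Styles: styles})
			case "a":
				href, _ := el["href"].(string)
				rich = true
				parts = append(parts, Part{Type: "link", Text: text, URL: href})
			case "at":
				uid, _ := el["user_id"].(string)
				rich = true
				parts = append(parts, Part{Type: "mention", Text: text, UserID: uid})
			case "code_block":
				rich = true
				parts = append(parts, Part{Type: "code_block", Text: text})
			default:
				// Unknown tag: keep any text, but do not mark the post as rich.
				if text != "" {
					parts = append(parts, Part{Type: "text", Text: text})
				}
			}
		}
	}
	if !rich && len(lines) <= 1 {
		return nil // caller falls back to the plain Text field
	}
	return parts
}
```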
* fix(telegram): slice entities by UTF-16 code units
Telegram entity offset and length are documented as UTF-16 code units,
but the parser indexed into []rune, which under-counts every
supplementary-plane character (most emoji) by one position. A bold
entity following 🎉 in the message would land one character right of
its real target, so the LLM would see "old" wrapped in <b> instead of
"bold", and any further entity in the same message would compound the
drift.
Switch the parser to walk utf16.Encode/Decode of the text, matching
the Telegram spec. The BMP-only test is kept (CJK still indexes the
same under either model) and the surrogate-pair case is now covered by
TestExtractTelegramMessageParts_HandlesSupplementaryPlaneEmoji.
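
To make the drift concrete, here is a small sketch that slices the same entity both ways. The helper names sliceUTF16 and sliceRunes are hypothetical; utf16.Encode and utf16.Decode are the standard library calls named above.

```go
package main

import "unicode/utf16"

// sliceUTF16 interprets offset and length as UTF-16 code units, which is how
// Telegram documents entity positions.
func sliceUTF16(text string, offset, length int) (string, bool) {
	units := utf16.Encode([]rune(text))
	if offset < 0 || length <= 0 || offset+length > len(units) {
		return "", false
	}
	// A range that cuts a surrogate pair in half decodes to U+FFFD.
	return string(utf16.Decode(units[offset : offset+length])), true
}

// sliceRunes reproduces the old behaviour: the same numbers applied to []rune.
func sliceRunes(text string, offset, length int) (string, bool) {
	runes := []rune(text)
	if offset < 0 || length <= 0 || offset+length > len(runes) {
		return "", false
	}
	return string(runes[offset : offset+length]), true
}
```

For the text "🎉 bold text" with a bold entity at offset 3, length 4, sliceUTF16 yields "bold" while sliceRunes yields "old ", because the emoji occupies two UTF-16 code units but only one rune.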
* fix(telegram): bump inbound Format to rich when Parts populate
toInboundTelegramMessage hard-coded MessageFormatPlain regardless of
whether entity parsing produced rich Parts, so an inbound message with
bold/italic/code spans reached downstream consumers tagged as plain
even though Parts already carried the structure. Discord/Slack/Feishu
inbound flip Format to Rich in the same condition; align Telegram.
Also rename the local variable from mentionParts to richParts now that
the extractor covers all supported entity types, not just mentions.
* fix(pipeline): preserve whitespace-only text parts in adapt
adaptParts rejected any text MessagePart whose body trimmed to empty,
so structured-post adapters that interleave content with newline
separators (Feishu emits a "\n" text Part between lines) lost the
separators and the agent saw `line1line2` instead of `line1\nline2`.
Only drop literally empty parts now; whitespace-only spans flow through
unchanged. The existing single-line "all-plain-no-styles → nil" guards
on the adapter side still elide whole-message whitespace runs, so this
loosening only matters once at least one rich span makes the message
non-trivial.
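
The effect of the loosened filter can be shown with a small sketch. The Part type and joinText helper are hypothetical; the dropWhitespaceOnly flag exists only to contrast the old predicate with the new one.

```go
package main

import "strings"

// Part is a hypothetical stand-in for a text MessagePart.
type Part struct {
	Type string
	Text string
}

// joinText concatenates part bodies. With dropWhitespaceOnly set it mimics the
// old filter (trimmed-empty parts are rejected); without it, only literally
// empty parts are dropped.
func joinText(parts []Part, dropWhitespaceOnly bool) string {
	var b strings.Builder
	for _, p := range parts {
		if p.Text == "" {
			continue
		}
		if dropWhitespaceOnly && strings.TrimSpace(p.Text) == "" {
			continue
		}
		b.WriteString(p.Text)
	}
	return b.String()
}
```

Given the parts "line1", "\n", "line2", the old predicate produces `line1line2` and the new one keeps the separator.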
* fix(feishu): keep text from unknown post tags
extractFeishuPostText's default branch reads part["text"] for any
unrecognised tag, so the legacy text path always surfaced user-visible
copy from forward-compat tags. The new Parts builder's default branch
silently dropped them, so a post mixing a styled element (which flips
adaptBody to the Parts path) with an unknown text-bearing tag lost
that content from the LLM context entirely.
Forward text-bearing unknown tags as plain text Parts. rich=false stays
so this alone never promotes an all-plain post into the rich path.
* fix(telegram): preserve sender text on text_mention parts
The text_mention branch overwrote the entity slice with "@" plus the
linked profile's first name. When a sender anchored a tg://user link
to a custom label such as "the reviewer", the LLM saw "@alice"
instead of what was actually written; the link target also lost the
visible context the sender chose.
Use the entity slice as the mention display, surface the linked user's
id and profile via Metadata, and set ChannelIdentityID to the platform
user id so downstream identity rendering still works.
* fix(telegram): split outer entity around nested link/mention
Telegram delivers overlapping entities natively (a bold span covers
the whole rendered text, a text_link or text_mention pins a sub-range).
The cursor guard dropped any nested entity, so a bold-with-link
message reached the LLM as bold text only — the URL/identity signal
was lost, even though the wire payload carried it.
Pre-pass to identify each structural entity's smallest container. In
the main loop, when an outer entity has nested structural children,
emit the outer in segments around the children: lead styled run, then
the child (link or mention with full URL/ChannelIdentityID), then the
tail styled run. The flat MessagePart schema still can't carry both
the outer style and the link at the same position, so the link span
itself appears without the outer's style — but the URL is preserved,
and the surrounding text keeps the user's emphasis.
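
The segment emission can be sketched on plain ranges. The Span type and splitAround name are hypothetical, and the sketch assumes the children have already been assigned to this outer entity by the pre-pass.

```go
package main

import "sort"

// Span is a half-open range [Start, End) tagged with an entity type.
type Span struct {
	Type  string
	Start int
	End   int
}

// splitAround emits the outer span in segments around its structural children:
// lead styled run, child, styled run between children, and tail styled run.
// The child keeps its own type, so it appears without the outer style.
func splitAround(outer Span, children []Span) []Span {
	kids := append([]Span(nil), children...)
	sort.Slice(kids, func(i, j int) bool { return kids[i].Start < kids[j].Start })

	var out []Span
	cursor := outer.Start
	for _, c := range kids {
		// Ignore children that are empty, overlap an earlier child, or escape the outer range.
		if c.End <= c.Start || c.Start < cursor || c.End > outer.End {
			continue
		}
		if c.Start > cursor {
			out = append(out, Span{outer.Type, cursor, c.Start})
		}
		out = append(out, c)
		cursor = c.End
	}
	if cursor < outer.End {
		out = append(out, Span{outer.Type, cursor, outer.End})
	}
	return out
}
```

A bold span over [0, 20) with a link at [5, 9) becomes bold [0, 5), link [5, 9), bold [9, 20). When the child covers exactly the same range as the outer span, no lead or tail run is produced and only the link remains.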
* fix(telegram): treat coextensive style+structural as nested
The split pre-pass skipped any candidate whose range exactly matched the
structural entity's. A fully bold link (bold and text_link covering the
identical span) therefore had no recognised parent, the main loop emitted
only the styled span, and the cursor guard then dropped the text_link —
the URL never reached the LLM even though Telegram delivered it.
Drop the equality exclusion so a style entity on the same range counts
as the structural's parent, and explicitly skip structural candidates
when picking a parent so two coextensive links don't mutually adopt each
other and disappear. The split path then emits just the link Part (the
flat schema still can't carry both style and link at the same span; the
URL wins).
1 parent 54609a3 commit 061268c
9 files changed
Lines changed: 1396 additions & 54 deletions
File tree
- internal
- channel/adapters
- feishu
- telegram
- pipeline
Lines changed: 186 additions & 0 deletions