Reply Extraction
Automatic extraction of reply text from quoted email chains
Overview
When someone replies to an email, their mail client usually includes the entire conversation history as quoted text:
Thanks for the update!
On March 15, 2026 at 2:30 PM, Agent <agent@daimon.email> wrote:
> Hi John,
>
> Here's the report you requested.
>
> Best,
> Agent
For AI agents processing replies, the quoted history is noise. You only want the fresh reply text: "Thanks for the update!"
daimon.email automatically extracts clean reply text using TalonJS, a battle-tested library for separating replies from quoted content.
Info
Reply extraction is automatic for all inbound emails. The reply_body field contains just the new reply text, while text_body contains the full message including quoted history.
How Reply Extraction Works
When an email arrives at daimon.email:
- Parse raw MIME - Extract headers and body
- Detect quoted content - Identify signature delimiters, quote markers, forwarding lines
- Extract reply text - Separate fresh content from quoted history
- Normalize formatting - Remove excessive whitespace, normalize line breaks
- Store both versions - reply_body (clean) and text_body (full)
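The steps above can be sketched with Python's standard `email` package. This is a simplified illustration under stated assumptions, not TalonJS's algorithm or daimon.email's actual pipeline: `process_inbound`, `QUOTE_HEADER`, and the sample message are hypothetical, and the quote detection only recognizes an "On ... wrote:" line or a leading `>`.

```python
import re
from email import message_from_string
from email.policy import default

# Hypothetical sample message used only for this sketch.
RAW = (
    "From: John <john@example.com>\n"
    "To: agent@daimon.email\n"
    "Subject: Re: Report\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Thanks for the update!\n"
    "\n"
    "On March 15, 2026 at 2:30 PM, Agent <agent@daimon.email> wrote:\n"
    "> Hi John,\n"
    "> Here's the report you requested.\n"
)

# Assumed heuristic: a line shaped like "On <date> ... wrote:" opens the quoted history.
QUOTE_HEADER = re.compile(r"^On .+ wrote:$")


def process_inbound(raw: str) -> dict:
    # Step 1: parse raw MIME into headers and a plain-text body.
    msg = message_from_string(raw, policy=default)
    part = msg.get_body(preferencelist=("plain",))
    text_body = part.get_content() if part is not None else ""

    # Steps 2-3: keep lines until the first quote header or ">" marker.
    fresh = []
    for line in text_body.splitlines():
        if QUOTE_HEADER.match(line.strip()) or line.lstrip().startswith(">"):
            break
        fresh.append(line)

    # Step 4: collapse runs of blank lines and trim the edges.
    reply_body = re.sub(r"\n{3,}", "\n\n", "\n".join(fresh)).strip()

    # Step 5: return both versions side by side.
    return {
        "subject": str(msg["Subject"]),
        "text_body": text_body.strip(),
        "reply_body": reply_body,
    }
```

Calling `process_inbound(RAW)` yields a `reply_body` containing only the first line of the message, while `text_body` keeps the quoted history.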
graph LR
A[Raw Email] --> B[TalonJS]
B --> C[reply_body<br/>'Thanks!']
B --> D[text_body<br/>'Thanks!<br/>On March 15...']
style C fill:#27ae60
style D fill:#95a5a6
Fields Explained
| Field | Description | Use Case |
|---|---|---|
| text_body | Full email body including quoted history | Debugging, archiving, full context |
| reply_body | Clean reply text only, no quotes | AI processing, sentiment analysis, response generation |
| html_body | Full HTML version (if available) | Rendering in web UI |
{
"message_id": "msg_9kL2xYw1pQm3",
"from": "customer@company.com",
"to": ["agent@daimon.email"],
"subject": "Re: Support ticket #1234",
"text_body": "Thanks for the quick fix!\n\nOn March 15, 2026 at 2:30 PM, Agent <agent@daimon.email> wrote:\n> Hi,\n> I've resolved the issue. Please try again.\n> Best,\n> Agent",
"reply_body": "Thanks for the quick fix!",
"html_body": "<div>Thanks for the quick fix!</div><div class='gmail_quote'>On March 15, 2026 at 2:30 PM, Agent <agent@daimon.email> wrote:<blockquote>Hi,<br>I've resolved the issue...</blockquote></div>"
}
Using Reply Text in Agents
Most AI agents should process reply_body, not text_body:
app.post('/webhooks/daimon', async (req, res) => {
  const event = req.body;
  if (event.type === 'message.received') {
    const message = event.data;
    // ✅ Use reply_body for AI processing
    const sentiment = await analyzeSentiment(message.reply_body);
    const intent = await classifyIntent(message.reply_body);
    console.log('Customer replied:', message.reply_body);
    console.log('Sentiment:', sentiment); // "positive"
    console.log('Intent:', intent); // "gratitude"
    // Generate response based on clean reply
    const response = await generateResponse(message.reply_body);
    await client.inboxes.send(message.inbox_id, {
      to: message.from,
      subject: `Re: ${message.subject}`,
      text: response,
      thread_id: message.thread_id,
    });
  }
  res.json({ received: true });
});
from flask import Flask, request

app = Flask(__name__)

# client, analyze_sentiment, classify_intent, and generate_response are
# assumed to be defined elsewhere in your application.

@app.route('/webhooks/daimon', methods=['POST'])
def handle_webhook():
    event = request.json
    if event['type'] == 'message.received':
        message = event['data']
        # ✅ Use reply_body for AI processing
        sentiment = analyze_sentiment(message['reply_body'])
        intent = classify_intent(message['reply_body'])
        print(f"Customer replied: {message['reply_body']}")
        print(f"Sentiment: {sentiment}")
        print(f"Intent: {intent}")
        # Generate response
        response = generate_response(message['reply_body'])
        client.inboxes.send(
            inbox_id=message['inbox_id'],
            to=message['from'],
            subject=f"Re: {message['subject']}",
            text=response,
            thread_id=message['thread_id']
        )
    return {'received': True}
Signature Detection
TalonJS automatically detects and removes email signatures:
Common Signature Patterns
Thanks!
--
John Doe
VP of Engineering
Acme Corp
john@acme.com | (555) 123-4567

Best regards,
Jane Smith
Sent from my iPhone

Cheers,
Bob
________________________________
This email and any attachments are confidential...

All of these are stripped from reply_body, leaving only: "Thanks!", "Best regards,", and "Cheers,".
Custom Signature Delimiters
TalonJS recognizes:
- -- (standard delimiter)
- ___ (Outlook-style)
- Sent from my iPhone/Android
- Thanks,\n[Name]\n[Title] patterns
- Confidentiality notices
- Legal disclaimers
Note
If your use case requires preserving signatures, use text_body instead of reply_body.
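As a rough illustration of delimiter-based signature removal, the sketch below cuts the text at the first line that matches one of the delimiters listed above. The regexes and the `strip_signature` name are assumptions made for this example, not TalonJS internals, and the sketch deliberately ignores the name/title and disclaimer patterns, so a sign-off name such as "Bob" survives here even though reply_body would drop it.

```python
import re

# Illustrative patterns based on the delimiter list above (not TalonJS's own rules).
SIGNATURE_STARTS = [
    re.compile(r"^--$"),                                  # standard "--" delimiter
    re.compile(r"^_{3,}$"),                               # Outlook-style underscore rule
    re.compile(r"^Sent from my (iPhone|Android)", re.I),  # mobile client footer
]


def strip_signature(text: str) -> str:
    """Return the text above the first signature delimiter, if any."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if any(pattern.match(line.strip()) for pattern in SIGNATURE_STARTS):
            return "\n".join(lines[:index]).strip()
    return text.strip()
```

Text without any recognized delimiter is returned unchanged apart from trimming.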
Quote Styles Supported
TalonJS handles multiple quote formats:
Gmail/Modern Clients
Great, thanks!
On Sun, Mar 15, 2026 at 2:30 PM Agent <agent@daimon.email> wrote:
> Hi John,
> Here's the update.
Extracted: "Great, thanks!"
Outlook
Perfect, thank you.
________________________________
From: Agent <agent@daimon.email>
Sent: Sunday, March 15, 2026 2:30 PM
To: John Doe
Subject: Re: Project update
Hi John,
Here's the update.
Extracted: "Perfect, thank you."
Plain Text (> quotes)
Sounds good!
> Hi,
> Here's the info you requested.
> Thanks,
> Agent
Extracted: "Sounds good!"
Forwarded Messages
FYI - see below.
---------- Forwarded message ---------
From: Agent <agent@daimon.email>
Date: Sun, Mar 15, 2026 at 2:30 PM
Subject: Report
To: john@company.com
Here's the quarterly report.
Extracted: "FYI - see below."
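To make the four quote styles concrete, here is a minimal sketch that finds the first line marking the start of quoted or forwarded content and reports which style matched. The patterns and the `split_reply` helper are hypothetical and much cruder than a real extractor: for instance, any fresh line beginning with "From: " would be misread as an Outlook header.

```python
import re
from typing import Optional

# Assumed marker patterns, one per style described above.
QUOTE_MARKERS = {
    "gmail": re.compile(r"^On .+ wrote:$"),
    "outlook": re.compile(r"^From: .+"),
    "forwarded": re.compile(r"^-{5,} ?Forwarded message ?-{5,}$"),
    "plain": re.compile(r"^>"),
}


def split_reply(text_body: str) -> tuple[str, Optional[str]]:
    """Return (fresh reply text, matched quote style or None)."""
    lines = text_body.splitlines()
    for index, line in enumerate(lines):
        stripped = line.strip()
        for style, pattern in QUOTE_MARKERS.items():
            if pattern.match(stripped):
                reply_lines = lines[:index]
                # Drop blank lines and the underscore rule Outlook places above "From:".
                while reply_lines and re.fullmatch(r"_{5,}|\s*", reply_lines[-1].strip()):
                    reply_lines.pop()
                return "\n".join(reply_lines).strip(), style
    return text_body.strip(), None
```

Returning the matched style alongside the text makes it easier to log which client format a misextracted message came from.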
Link Extraction
In addition to reply text, daimon.email extracts links from emails:
Standard Links
All URLs in the email are extracted to the links array:
{
"reply_body": "Check out our docs at https://docs.daimon.email for more info.",
"links": [
"https://docs.daimon.email"
]
}
CTA Links (Call-to-Action)
daimon.email uses pattern matching to identify special action links:
- Verification URLs: https://example.com/verify?token=...
- Confirmation links: https://example.com/confirm/...
- Password reset: https://example.com/reset-password?code=...
- Unsubscribe: https://example.com/unsubscribe?id=...
- Magic links: https://example.com/login?magic=...
These are extracted to the cta_links array for easy agent access:
{
"message_id": "msg_9kL2xYw1pQm3",
"from": "noreply@service.com",
"subject": "Verify your email address",
"reply_body": "Click here to verify: https://service.com/verify?token=abc123xyz\n\nIf you didn't request this, ignore this email.",
"links": [
"https://service.com/verify?token=abc123xyz"
],
"cta_links": [
{
"url": "https://service.com/verify?token=abc123xyz",
"type": "verification",
"confidence": 0.95
}
]
}
// Handle verification email
app.post('/webhooks/daimon', async (req, res) => {
  const event = req.body;
  if (event.type === 'message.received') {
    const message = event.data;
    // Check for verification links above the 0.85 confidence threshold
    const verifyLink = (message.cta_links ?? []).find(
      link => link.type === 'verification' && link.confidence > 0.85
    );
    if (verifyLink) {
      console.log('Verification link detected:', verifyLink.url);
      // Agent can automatically click verification links
      const result = await fetch(verifyLink.url, { method: 'GET' });
      if (result.ok) {
        console.log('Email verified automatically');
      }
    }
  }
  res.json({ received: true });
});
import requests
from flask import Flask, request

app = Flask(__name__)

@app.route('/webhooks/daimon', methods=['POST'])
def handle_webhook():
    event = request.json
    if event['type'] == 'message.received':
        message = event['data']
        # Check for verification links above the 0.85 confidence threshold
        verify_links = [
            link for link in message.get('cta_links', [])
            if link['type'] == 'verification' and link['confidence'] > 0.85
        ]
        if verify_links:
            verify_url = verify_links[0]['url']
            print(f"Verification link detected: {verify_url}")
            # Agent can auto-click verification
            resp = requests.get(verify_url, timeout=10)
            if resp.ok:
                print('Email verified automatically')
    return {'received': True}
CTA Link Types
| Type | Example Pattern | Confidence Threshold |
|---|---|---|
| verification | /verify?token=, /confirm-email | >0.85 |
| password_reset | /reset-password, /set-password | >0.85 |
| unsubscribe | /unsubscribe, list-unsubscribe | >0.90 |
| magic_link | /login?magic=, /authenticate?token= | >0.80 |
| confirmation | /confirm/, /accept-invite | >0.80 |
Info
The confidence score (0-1) indicates how certain the classifier is. Scores above 0.8 are generally safe to act on autonomously, subject to the per-type thresholds in the table above.
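The pattern matching described in this section can be approximated as follows. This is only a sketch: the URL regex, the `extract_links` and `classify_cta` helpers, and the matching order are assumptions, the substrings are copied from the table above, and no confidence score is computed because the document does not say how the classifier derives one.

```python
import re
from typing import Optional

# Simplified URL matcher; real-world URL detection has more edge cases.
URL_RE = re.compile(r"https?://[^\s<>\"')]+")

# Substrings copied from the CTA Link Types table; first match wins (an assumption).
CTA_PATTERNS = [
    ("verification", ("/verify?token=", "/confirm-email")),
    ("password_reset", ("/reset-password", "/set-password")),
    ("unsubscribe", ("/unsubscribe", "list-unsubscribe")),
    ("magic_link", ("/login?magic=", "/authenticate?token=")),
    ("confirmation", ("/confirm/", "/accept-invite")),
]


def extract_links(text: str) -> list[str]:
    """Return unique URLs in order of appearance, without trailing punctuation."""
    links: list[str] = []
    for url in URL_RE.findall(text):
        url = url.rstrip(".,;:!?")
        if url not in links:
            links.append(url)
    return links


def classify_cta(url: str) -> Optional[str]:
    """Return the CTA type whose pattern appears in the URL, or None."""
    lowered = url.lower()
    for cta_type, needles in CTA_PATTERNS:
        if any(needle in lowered for needle in needles):
            return cta_type
    return None
```

Substring matching like this is easy to audit but also easy to fool, which is one reason to treat the type as a hint rather than proof of what a link does.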
Edge Cases and Limitations
Multi-level Quotes
TalonJS handles nested quotes (replies to replies), but only extracts the topmost reply:
I agree with that approach.
On March 16 at 3pm, Jane wrote:
> Sounds good to me!
>
> On March 15 at 2pm, Agent wrote:
>> Here's the proposal.
Extracted: "I agree with that approach."
Interleaved Replies
Some users reply inline (between quoted lines). TalonJS may not perfectly handle these:
> What's the status of task A?
Task A is done.
> What about task B?
Task B is still in progress.
Extracted: May include partial quotes. Review text_body for full context.
Note
For complex interleaved replies, consider using the full text_body and parsing with custom logic or an LLM prompt.
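One possible shape for that custom logic is to pair each quoted block with the unquoted lines that follow it. The `pair_interleaved` function below is a hypothetical helper that operates on text_body and assumes quoted lines start with `>`; fresh text that appears before any quote is paired with an empty string.

```python
def pair_interleaved(text_body: str) -> list[tuple[str, str]]:
    """Pair each quoted block with the unquoted answer that follows it."""
    pairs: list[tuple[str, str]] = []
    quoted: list[str] = []
    answer: list[str] = []
    for line in text_body.splitlines():
        if line.lstrip().startswith(">"):
            if answer:
                # A new quoted block begins, so the previous pair is complete.
                pairs.append((" ".join(quoted), " ".join(answer)))
                quoted, answer = [], []
            content = line.lstrip().lstrip(">").strip()
            if content:
                quoted.append(content)
        elif line.strip():
            answer.append(line.strip())
    if quoted or answer:
        pairs.append((" ".join(quoted), " ".join(answer)))
    return pairs
```

Each resulting pair can then be handed to downstream logic as a question with its answer, instead of treating the whole message as a single reply.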
HTML-only Emails
Some emails only have HTML (no plain text). TalonJS extracts from HTML but may miss styling context:
<div style="color: red; font-weight: bold;">URGENT: Please respond ASAP</div>
<div class="gmail_quote">On March 15...</div>
Extracted: "URGENT: Please respond ASAP" (styling lost)
If styling matters (e.g., detecting urgency by color), parse html_body directly.
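A minimal sketch of parsing html_body for styling cues, using only Python's standard `html.parser`. The `EmphasisFinder` class and the choice of "red or bold inline style" as an urgency signal are assumptions for illustration; the substring check is naive (it would also match `background-color: red`) and ignores CSS classes and stylesheets.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class EmphasisFinder(HTMLParser):
    """Collects text inside elements whose inline style is red or bold."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: list[bool] = []      # one flag per open element
        self.emphasized: list[str] = []  # text found inside flagged elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        flagged = "color:red" in style or "font-weight:bold" in style
        inherited = bool(self.stack and self.stack[-1])
        self.stack.append(flagged or inherited)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.emphasized.append(data.strip())


def find_emphasized_text(html_body: str) -> list[str]:
    finder = EmphasisFinder()
    finder.feed(html_body)
    finder.close()
    return finder.emphasized
```

Child elements inherit the flag from their parent, so text nested inside a styled container is still reported.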
Debugging Reply Extraction
Compare reply_body vs text_body to verify extraction:
// Fetch message
const message = await client.inboxes.getMessage(inboxId, messageId);
console.log('=== FULL TEXT ===');
console.log(message.text_body);
console.log('\n=== EXTRACTED REPLY ===');
console.log(message.reply_body);
console.log('\n=== LINKS ===');
console.log(message.links);
console.log('\n=== CTA LINKS ===');
console.log(message.cta_links);
# Fetch message
message = client.inboxes.get_message(inbox_id, message_id)
print('=== FULL TEXT ===')
print(message['text_body'])
print('\n=== EXTRACTED REPLY ===')
print(message['reply_body'])
print('\n=== LINKS ===')
print(message['links'])
print('\n=== CTA LINKS ===')
print(message['cta_links'])
Fallback Strategy
If reply_body is empty or seems incorrect, fall back to text_body:
async function getReplyText(message: Message): Promise<string> {
  // Use reply_body whenever it contains any non-whitespace text;
  // short replies such as "Thanks!" are valid and must not be discarded
  if (message.reply_body && message.reply_body.trim().length > 0) {
    return message.reply_body;
  }
  // Fall back to full text
  console.warn('reply_body empty, using text_body');
  return message.text_body;
}
Custom Extraction (Advanced)
If TalonJS doesn't meet your needs, process text_body with custom logic or an LLM:
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

async function extractReplyWithLLM(fullText: string): Promise<string> {
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    messages: [{
      role: 'user',
      content: `Extract only the fresh reply text from this email, excluding any quoted history, signatures, or disclaimers:\n\n${fullText}`,
    }],
  });
  // Content blocks are a union type; read text only from a text block
  const block = response.content[0];
  return block?.type === 'text' ? block.text : '';
}

// Use custom extraction
const customReply = await extractReplyWithLLM(message.text_body);
console.log('LLM-extracted reply:', customReply);
import os

import anthropic

client = anthropic.Anthropic(api_key=os.environ['ANTHROPIC_API_KEY'])

def extract_reply_with_llm(full_text: str) -> str:
    response = client.messages.create(
        model='claude-3-5-sonnet-20241022',
        max_tokens=1024,
        messages=[{
            'role': 'user',
            'content': f"Extract only the fresh reply text from this email, excluding any quoted history, signatures, or disclaimers:\n\n{full_text}"
        }]
    )
    return response.content[0].text

# Use custom extraction
custom_reply = extract_reply_with_llm(message['text_body'])
print(f"LLM-extracted reply: {custom_reply}")
Best Practices
Use reply_body for AI Processing
Feed clean reply text to your LLM, not the full conversation history. This reduces token usage and improves accuracy.
Use text_body for Archiving
Store full conversation history for debugging, compliance, or context retrieval.
Validate CTA Links Before Clicking
Check confidence scores and URL patterns before autonomously clicking verification links.
Handle Empty reply_body Gracefully
Some emails may have empty reply_body (e.g., forwarded messages with no new text). Fall back to text_body.
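For the "Validate CTA Links Before Clicking" practice, a guard along these lines could run before any autonomous request. The thresholds are copied from the CTA Link Types table; the `is_safe_to_click` name and the `allowed_hosts` allowlist are assumptions introduced for this sketch, not part of the daimon.email API.

```python
from urllib.parse import urlparse

# Per-type minimum confidence, taken from the CTA Link Types table.
MIN_CONFIDENCE = {
    "verification": 0.85,
    "password_reset": 0.85,
    "unsubscribe": 0.90,
    "magic_link": 0.80,
    "confirmation": 0.80,
}


def is_safe_to_click(cta_link: dict, allowed_hosts: set[str]) -> bool:
    """Accept a CTA link only if its score, scheme, and host all pass."""
    threshold = MIN_CONFIDENCE.get(cta_link.get("type"))
    if threshold is None or cta_link.get("confidence", 0.0) <= threshold:
        return False
    parsed = urlparse(cta_link.get("url", ""))
    # Hypothetical policy: HTTPS only, and only hosts the agent expects mail from.
    return parsed.scheme == "https" and parsed.hostname in allowed_hosts
```

An allowlist keeps the agent from following a high-confidence link that points at a host it has never interacted with.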
Performance
Reply extraction adds minimal latency:
| Step | Latency |
|---|---|
| MIME parsing | ~5ms |
| TalonJS extraction | ~15ms |
| Link extraction | ~3ms |
| Total overhead | ~23ms |
For high-volume inbound processing (>1000 emails/min), this is negligible.
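As a quick check on the arithmetic, the three steps sum to 5 + 15 + 3 = 23 ms per message. The sketch below converts that into cumulative processing time per minute at a given inbound rate; `extraction_seconds_per_minute` is a hypothetical helper, and how that time is spread across workers is not stated in this document.

```python
def extraction_seconds_per_minute(emails_per_minute: float, overhead_ms: float = 23.0) -> float:
    """Total extraction time, in seconds, accumulated per minute of inbound traffic."""
    return emails_per_minute * overhead_ms / 1000.0
```

At 1000 emails per minute this comes to about 23 seconds of cumulative extraction work each minute, while the added latency for any single message stays at roughly 23 ms.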
Next Steps
- Sending & Receiving - Email flow overview
- Agent Autonomy Levels - How to use extracted links for autonomous actions
- API Reference: Messages - Full message object schema