Correct system prompt statutes and fix evaluation dataset #325

Open
yangm2 wants to merge 1 commit into codeforpdx:main from yangm2:pr/system-prompt-and-dataset

@yangm2 yangm2 commented Apr 18, 2026

What type of PR is this? (check all applicable)

  • Bug Fix
  • Documentation Update

Description

Corrections to the system prompt and evaluation dataset, reviewed with the project lawyer.

system_prompt.md:

  • Add verbatim statute quotes for ORS 90.394(1) (week-to-week notice timing), ORS 90.425(3), (6)(b), and (8) (abandoned property), ORS 90.325(3)(b) and (4) (victim damage liability), and ORS 90.245 (lease cannot waive tenant rights)
  • Add ORS 90.395(2)/(3)(a) rental assistance notice requirement to eviction notice checklist
  • Clarify PCC 30.01.087 Portland security deposit interest rule
  • Call out that ORS 90.155 explicitly excludes ORS 90.425 (no "post" option for abandoned property notices)
  • Reorganize behavioral defaults and grounding rules for clarity

dataset-tenant-legal-qa-examples.jsonl:

  • Import domestic violence scenario (scenario 3) based on ORS 90.325
  • Fix scenario 1 (abandoned property): correct delivery methods (no "post/attachment" option), state both contact windows (5 days after personal delivery / 8 days after mailing), clarify 15-day pickup window begins after tenant responds — not as an alternative to the contact deadline
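
For reviewers checking scenario 1, the timing described above can be sketched as date arithmetic. This is a minimal illustration, not project code: the function names are hypothetical, the day counts are the ones stated in this description, and counting plain calendar days from the delivery or mailing date is an assumption that should be checked against ORS 90.425.

```python
from datetime import date, timedelta
from typing import Literal

DeliveryMethod = Literal["personal", "mail"]

# Day counts as stated in this PR description. How days are counted
# (calendar vs. business days, inclusive vs. exclusive) is an assumption.
CONTACT_WINDOW_DAYS = {"personal": 5, "mail": 8}
PICKUP_WINDOW_DAYS = 15


def contact_deadline(notice_date: date, method: DeliveryMethod) -> date:
    """Last day for the tenant to contact the landlord after the notice."""
    return notice_date + timedelta(days=CONTACT_WINDOW_DAYS[method])


def pickup_deadline(tenant_response_date: date) -> date:
    """The pickup window runs from the tenant's response, not from the notice."""
    return tenant_response_date + timedelta(days=PICKUP_WINDOW_DAYS)


# Example: notice mailed on 2026-04-01, tenant responds on 2026-04-05.
print(contact_deadline(date(2026, 4, 1), "mail"))  # 2026-04-09
print(pickup_deadline(date(2026, 4, 5)))           # 2026-04-20
```

Keeping the two deadlines in separate functions mirrors the fix to scenario 1: the 15-day window is a second step that starts only once the tenant has responded, rather than an alternative to the contact deadline.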

evaluators/legal_correctness.md:

  • Add statute quotes and citation-trap notes to help the LLM judge verify citations accurately

Related Tickets & Documents

  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

No code changes. Review system_prompt.md and dataset-tenant-legal-qa-examples.jsonl against ORS 90.425(3), (6)(b), and (8).

Added/updated tests?

  • No, and this is why: markdown and dataset-only changes, no runnable logic
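
Even without runnable logic, a quick parse check on the dataset can catch malformed lines before the re-upload. The sketch below assumes only that each non-blank line of the `.jsonl` file is a standalone JSON object; `find_bad_jsonl_lines` is a hypothetical helper that is not part of this repository, and no dataset field names are assumed.

```python
import json


def find_bad_jsonl_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, problem) for each non-blank line that is not a JSON object."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "expected a JSON object"))
    return problems
```

To use it, read the dataset file as UTF-8 text (its location inside the repository is not stated here) and pass the contents to the function; an empty list means every record parsed.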

Documentation

  • If this PR changes the system architecture, Architecture.md has been updated

[optional] Are there any post deployment tasks we need to perform?

After merging, re-upload the dataset to LangSmith:

cd backend
uv run python -m evaluate.create_langsmith_dataset

system_prompt.md:
- add exact statute quotes for ORS 90.394(1), 90.425(3)(6)(b)(8), 90.325(3)(b)(4), 90.245
- add ORS 90.395(2) rental assistance notice requirement
- clarify PCC 30.01.087 Portland security deposit interest rule
- call out citation traps (ORS 90.155 excluded from ORS 90.425)
- reorganize behavioral defaults and grounding rules

dataset:
- import domestic violence scenario (scenario 3)
- fix scenario 1 (abandoned property): correct delivery methods (no 'post' option),
  state both contact windows (5 days personal / 8 days mailed), clarify 15-day
  pickup window starts after tenant responds
- add statute quotes and citation-trap notes to legal_correctness evaluator
@yangm2 yangm2 requested a review from TruMichael-jpg April 18, 2026 01:53
@yangm2 yangm2 self-assigned this Apr 18, 2026
@yangm2 yangm2 added the bug and documentation labels Apr 18, 2026
@yangm2 yangm2 mentioned this pull request Apr 18, 2026
11 tasks

@yangm2 yangm2 left a comment

I have skimmed these changes

@TruMichael-jpg TruMichael-jpg left a comment

Obviously the main system prompt will be an iterative effort -- feel free to merge and/or accept my suggested changes here as needed.

**Behavioral defaults:**
- Give full, detailed answers; limit responses to under {RESPONSE_WORD_LIMIT} words whenever possible.
- Ask only one question at a time so the user isn't confused.
- Assume the user is on a month-to-month lease unless they specify otherwise.

Suggested change
- Assume the user is on a month-to-month lease unless they specify otherwise.
- Assume the user is on a month-to-month tenancy unless they specify otherwise or unless the answer to their question would change if they are on a week-to-week tenancy or in the middle of a lease agreement, and if the latter, ask the user to confirm.

Adjusting some of the language here since this is a key factor and should not always be assumed.

@yangm2 (Contributor, Author):

Thanks, let me test this change locally to see how this affects the evaluations.

- When evaluating an eviction notice for nonpayment, always check: (1) whether the required notice period was given, (2) whether the notice was served on a legally permitted day relative to the start of the rental period — this varies by lease type (week-to-week and month-to-month tenancies have different rules under Oregon law), (3) whether proper service methods were used, and (4) whether the landlord included the required rental assistance notice under [ORS 90.395](https://oregon.public.law/statutes/ors_90.395)(2) — failure to deliver it is grounds for court dismissal of the eviction complaint under [ORS 90.395](https://oregon.public.law/statutes/ors_90.395)(3)(a).
- When the user states a position that their landlord (or another party) disputes, directly confirm or refute it using the retrieved law.
- City laws override state laws when there is a conflict. If the user is in a specific city, check for relevant city laws.
- If the user is being evicted for non-payment of rent, is too poor to pay, and you have confirmed the notice and court hearing date are valid, tell them to call Oregon Law Center at {OREGON_LAW_CENTER_PHONE_NUMBER}.

Could we add the Referrals page that MZ put together here instead of the OLC phone number?

@yangm2 (Contributor, Author):

Yeah I think we should eventually use the referrals that @michaelzhang43 sent out. I think we'll need a little more than a system prompt to support the qualifications, but we can put the initial static list in. I'll create an issue to do that in a future PR.

- Give full, detailed answers; limit responses to under {RESPONSE_WORD_LIMIT} words whenever possible.
- Ask only one question at a time so the user isn't confused.
- Assume the user is on a month-to-month lease unless they specify otherwise.
- Focus on finding technicalities that would legally prevent someone getting evicted, such as deficiencies in notice.

Suggested change
- Focus on finding technicalities that would legally prevent someone getting evicted, such as deficiencies in notice.
- Focus on finding technicalities that would legally prevent someone getting evicted, such as deficiencies in notice.
- If the user asks about a particular action they need to take or the information provided in your response includes actions or tasks that a tenant in the user's situation needs to take, include as much detail as needed for them to actually take the action (for example, if a notice needs to be sent, include details as to when, where and how such notice must be sent under the statute). If you do not have enough information to give such instructions, ask the user factual questions until you do.

This may be too ambitious or not necessary at this point... but I think we ought to move in this direction to make the bot more of a first-aid resource and prevent it from hyper-focusing on legal arguments the tenant would have to make in eviction court at the expense of practical actions they could take to resolve their situation (with an understanding of their rights).

@yangm2 (Contributor, Author):

This is a good point. Maybe this suggests a new evaluator? Like a "does a layperson know what to do with this information?" Evaluator/Rubric 🤔

Labels

bug, documentation

2 participants