Skip to content

Commit f61c694

Browse files
Nirus2000buchen
authored andcommitted
Imnprove PDF-Importer use convenience constructors, normalize wrap blocks and ExtractorMatchers
Replace the verbose three-line subject lambda with the AccountTransaction(Type) convenience constructor introduced in #5614 across all PDF importers. **Additionally:** - Add BuySellEntry(PortfolioTransaction.Type) and PortfolioTransaction(Type) convenience constructors and apply the same single-line subject lambda pattern to all BuySellEntry and PortfolioTransaction subject lambdas - Convert the remaining three verbose AccountTransferEntry subject lambdas to method references (AccountTransferEntry::new) - Replace setCurrencyCode(CurrencyUnit.X) with setCurrencyCode(asCurrencyCode("X")) - Add missing t.getCurrencyCode() != null guard to bare t.getAmount() == 0 checks - Normalize all simple wrap blocks to a consistent if/return SkippedItem format, removing intermediate item variables and unused ctx parameters - Add final modifier to all var type = new DocumentType(...) declarations **Improvement Arkéa Direct Bank and ExtractorMatchers** - Extend `hasExDate(String)` to accept `null`, allowing dividend tests to explicitly assert that no ex-date is present in a document - Add `hasExDate(null)` assertions to all three dividend tests to document the absence of an ex-date in Arkéa Direct Bank documents **Normalize currency handling in DocumentContext** - Extend `hasExDate(String)` to accept `null`, allowing dividend tests to explicitly assert that no ex-date is present in a document - Add `hasExDate(null)` assertions to all three dividend tests to document the absence of an ex-date in Arkéa Direct Bank documents **Normalize currency handling in DocumentContext** - Fix missing asCurrencyCode() in ctx.put("currency", ...) in BundesschatzPDFExtractor (two document types stored the raw captured string instead of the normalized currency code) - Remove redundant asCurrencyCode() wrapping in 15 extractors where currency is read back via .documentContext("currency") — the value stored in the context is already normalized, so the second call was unnecessary (affected: ABNAMROGroup, AkfBank, AudiBank, AustrianAnadi, BarclaysBankIreland, BawagAG, BoursesDirect, Bundesschatz, Crowdestor, DKB, INGDiBa, RaiffeisenBankgruppe, SantanderConsumerBank, Tradegate, VolkswagenBank) **Normalize currency handling — fixed-currency extractors and raw section reads** - Introduce `private static final String XXX = "XXX"` constants and replace all `CurrencyUnit.XXX` object references and unwrapped bare string literals with `asCurrencyCode(XXX)` in fixed-currency extractors (affected: LimeTradingCorp, FidelityInternational, Computershare, AvivaPLC, ScorePriorityInc, Simpel, Sunrise, SutorBankGmbH, KFintech) - Wrap `v.put("currency", stripBlanks(...))` with `asCurrencyCode()` where currency is captured from raw PDF text (affected: Comdirect 4×, Commerzbank 1×, LibertyVorsorgeAG 7×) - Wrap `v.put("currency", v.get("currency1/2"))` with `asCurrencyCode()` for currency1/currency2 relay patterns (affected: Consorsbank 12×, DZBankGruppe 6×, DKB 6×, INGDiBa 6×, Postbank 6×) - Wrap `Money.of(v.get("currency/feeCurrency"), ...)` with `asCurrencyCode()` in BSDEXPDFExtractor where both currency fields are extracted locally from raw PDF text (8× across buy/sell sections) **Improve asCurrencyCode() in AbstractPDFExtractor and fix TextUtil.trim()** - Replace `currency.trim()` (Java's String.trim, strips only ASCII ≤ 0x20) with `TextUtil.trim()` (already statically imported) which additionally handles Unicode spaces such as U+00A0 (NO-BREAK SPACE) and U+FEFF (Zero-Width BOM) - Fix asymmetric bug in `TextUtil.trim()`: the trailing whitespace loop used `Character.isWhitespace()` while the leading loop used the more comprehensive `TextUtil.isWhitespace()` — U+00A0, U+202F and U+FEFF were therefore trimmed leading but not trailing; both loops now use `TextUtil.isWhitespace()` - As a side effect, `MessagesTest` now detects a stray U+202F (NARROW NO-BREAK SPACE) trailing in the Spanish translation key `AttributeSettings_LimitPrice_ColorSettings` — removed - Add `TextUtilTest.testTrim()` covering regular whitespace, U+00A0 and U+FEFF trimming on both sides - Add `AbstractPDFExtractorTest` covering: standard ISO codes, U+00A0 trimming, regular whitespace trimming, null input, and unknown-code fallback to base currency
1 parent 674546b commit f61c694

150 files changed

Lines changed: 3144 additions & 8707 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.claude/rules/pdfextractors.md

Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,84 @@ PDF importers extract transactions from bank and broker PDF statements.
2020
- Consistency in coding style between all extractors is more important than nice code.
2121
- Add comment-blocks with `@formatter:off` and `@formatter:on` before each section-block showing the format that is handled there.
2222

23+
## Conventions
24+
25+
**DocumentType declaration** — always `final`:
26+
```java
27+
final var type = new DocumentType("...");
28+
```
29+
30+
**subject() lambda** — always use the single-argument convenience constructor or method reference, never the verbose 3-line form:
31+
```java
32+
// correct
33+
.subject(() -> new AccountTransaction(AccountTransaction.Type.DIVIDENDS))
34+
.subject(() -> new BuySellEntry(PortfolioTransaction.Type.BUY))
35+
.subject(() -> new PortfolioTransaction(PortfolioTransaction.Type.DELIVERY_INBOUND))
36+
.subject(AccountTransferEntry::new) // no Type parameter — use method reference
37+
38+
// wrong
39+
.subject(() -> {
40+
var t = new AccountTransaction();
41+
t.setType(AccountTransaction.Type.DIVIDENDS);
42+
return t;
43+
})
44+
```
45+
46+
**setCurrencyCode** — always use `asCurrencyCode()`, never `CurrencyUnit` constants:
47+
```java
48+
t.setCurrencyCode(asCurrencyCode("EUR")); // correct
49+
t.setCurrencyCode(CurrencyUnit.EUR); // wrong
50+
```
51+
52+
**Fixed-currency extractors** — when an extractor always deals with one currency (e.g. a USD-only broker), declare a private String constant and use it with `asCurrencyCode()`. Keep `asCurrencyCode()` — it has normalization side-effects even for known codes:
53+
```java
54+
private static final String USD = "USD";
55+
56+
v.put("currency", asCurrencyCode(USD)); // correct
57+
v.put("currency", CurrencyUnit.USD); // wrong — CurrencyUnit object, not String
58+
v.put("currency", asCurrencyCode("USD")); // works but prefer named constant
59+
```
60+
61+
**DocumentContext currency** — when storing currency into the document context, always normalize with `asCurrencyCode()`. When the value is later read back via `.documentContext("currency")`, it is already normalized — do NOT wrap it with `asCurrencyCode()` again:
62+
```java
63+
// storing — always normalize
64+
.assign((ctx, v) -> ctx.put("currency", asCurrencyCode(v.get("currency")))) // correct
65+
.assign((ctx, v) -> ctx.put("currency", v.get("currency"))) // wrong
66+
67+
// reading via documentContext — already normalized, no wrapping needed
68+
.section("tax").optional()
69+
.documentContext("currency")
70+
.assign((t, v) -> {
71+
var tax = Money.of(v.get("currency"), asAmount(v.get("tax"))); // correct
72+
var tax = Money.of(asCurrencyCode(v.get("currency")), asAmount(...)); // wrong — redundant
73+
})
74+
75+
// reading via local .section — currency is raw, asCurrencyCode() is required
76+
.section("tax", "currency").optional()
77+
.assign((t, v) -> {
78+
var tax = Money.of(asCurrencyCode(v.get("currency")), asAmount(v.get("tax"))); // correct
79+
})
80+
```
81+
82+
**wrap() block** — use the canonical `if/return SkippedItem` form; no ternary, no intermediate `item` variable, no unused `ctx` parameter. Always guard the amount check with a currency code null-check:
83+
```java
84+
// AccountTransaction
85+
.wrap(t -> {
86+
if (t.getCurrencyCode() != null && t.getAmount() == 0)
87+
return new SkippedItem(new TransactionItem(t), Messages.MsgErrorTransactionTypeNotSupportedOrRequired);
88+
89+
return new TransactionItem(t);
90+
})
91+
92+
// BuySellEntry
93+
.wrap(t -> {
94+
if (t.getPortfolioTransaction().getCurrencyCode() != null && t.getPortfolioTransaction().getAmount() == 0)
95+
return new SkippedItem(new BuySellEntryItem(t), Messages.MsgErrorTransactionTypeNotSupportedOrRequired);
96+
97+
return new BuySellEntryItem(t);
98+
})
99+
```
100+
23101

24102
# Testing
25103
Test methods should be in following structure
@@ -61,8 +139,37 @@ public void testWertpapierKauf01()
61139
}
62140
```
63141

142+
`withFailureMessage(...)` assertions follow this structure — the `//` after the opening parenthesis is required:
143+
144+
```java
145+
// check failure message
146+
assertThat(results, hasItem(withFailureMessage( //
147+
Messages.MsgErrorTransactionOrderCancellationUnsupported, //
148+
dividend( //
149+
hasDate("2022-06-13"), hasExDate(null), //
150+
hasShares(10.672), //
151+
hasSource("DividendeStorno01.txt"), //
152+
hasNote("Vorgangs-Nr.: XXXXX"), //
153+
hasAmount("EUR", 7.04), hasGrossValue("EUR", 7.04), //
154+
hasTaxes("EUR", 0.00), hasFees("EUR", 0.00)))));
155+
```
156+
`skippedItem(...)` assertions follow the same structure — the `//` after the opening parenthesis is required:
157+
158+
```java
159+
// check skipped item
160+
assertThat(results, hasItem(skippedItem( //
161+
Messages.MsgErrorTransactionTypeNotSupportedOrRequired, //
162+
purchase( //
163+
hasDate("2012-07-02T00:00"), hasShares(0.00), //
164+
hasSource("Quartalsbericht06.txt"), //
165+
hasNote(null), //
166+
hasAmount("EUR", 0.00), hasGrossValue("EUR", 0.00), //
167+
hasTaxes("EUR", 0.00), hasFees("EUR", 0.00)))));
168+
```
169+
64170
- Each method has a starting `assertThat`-Block checking the counts. All 8 assertions need to be present.
65171
- Use `//` to enforce line-breaks when checking securities and transactions
66172
- Include time (hours, minutes, seconds) in `hasDate` when the source document provides it.
67173
- The `ExtractorMatchers`-class contains test assertion helpers
68174
- The fees and taxes could consist of multiple values, do not merge or consolidate them for better understanding
175+
- Every `dividend(...)` assertion must include `hasExDate(...)`. The correct value comes from the JUnit test failure message: look for `dividend with exDate = <value>` in the mismatch output. Use that exact value. If the extractor does not extract an ex-date, the actual value will be `null`.

name.abuchen.portfolio.tests/src/name/abuchen/portfolio/datatransfer/ExtractorMatchers.java

Lines changed: 18 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -95,7 +95,7 @@ public ExtractorItemMatcher(String label, Function<Extractor.Item, V> transactio
9595
@Override
9696
protected boolean matchesSafely(Extractor.Item item, Description mismatchDescription)
9797
{
98-
V tx = transaction.apply(item);
98+
var tx = transaction.apply(item);
9999
if (tx == null)
100100
{
101101
mismatchDescription.appendText("\n* not a '").appendText(label).appendText("' item"); //$NON-NLS-1$ //$NON-NLS-2$
@@ -206,7 +206,7 @@ protected boolean matchesSafely(Extractor.Item item, Description mismatchDescrip
206206
return false;
207207
}
208208

209-
boolean result = matcher.matches(item);
209+
var result = matcher.matches(item);
210210
if (!result)
211211
{
212212
matcher.describeMismatch(item, mismatchDescription);
@@ -369,7 +369,7 @@ public static Matcher<Extractor.Item> feeRefund(Matcher<Transaction>... properti
369369
@SafeVarargs
370370
public static Matcher<Extractor.Item> inboundCash(Matcher<Transaction>... properties)
371371
{
372-
return new ExtractorItemMatcher<Transaction>("cash transfer", //$NON-NLS-1$
372+
return new ExtractorItemMatcher<>("cash transfer", //$NON-NLS-1$
373373
item -> item instanceof AccountTransferItem transfer //
374374
&& transfer.getSubject() instanceof AccountTransferEntry entry
375375
? entry.getTargetTransaction()
@@ -380,7 +380,7 @@ public static Matcher<Extractor.Item> inboundCash(Matcher<Transaction>... proper
380380
@SafeVarargs
381381
public static Matcher<Extractor.Item> outboundCash(Matcher<Transaction>... properties)
382382
{
383-
return new ExtractorItemMatcher<Transaction>("cash transfer", //$NON-NLS-1$
383+
return new ExtractorItemMatcher<>("cash transfer", //$NON-NLS-1$
384384
item -> item instanceof AccountTransferItem transfer //
385385
&& transfer.getSubject() instanceof AccountTransferEntry entry
386386
? entry.getSourceTransaction()
@@ -391,7 +391,7 @@ public static Matcher<Extractor.Item> outboundCash(Matcher<Transaction>... prope
391391
@SafeVarargs
392392
public static Matcher<Extractor.Item> purchase(Matcher<Transaction>... properties)
393393
{
394-
return new ExtractorItemMatcher<Transaction>("purchase", //$NON-NLS-1$
394+
return new ExtractorItemMatcher<>("purchase", //$NON-NLS-1$
395395
item -> item instanceof BuySellEntryItem buysell //
396396
&& buysell.getSubject() instanceof BuySellEntry entry
397397
&& entry.getPortfolioTransaction().getType() == PortfolioTransaction.Type.BUY
@@ -403,7 +403,7 @@ public static Matcher<Extractor.Item> purchase(Matcher<Transaction>... propertie
403403
@SafeVarargs
404404
public static Matcher<Extractor.Item> sale(Matcher<Transaction>... properties)
405405
{
406-
return new ExtractorItemMatcher<Transaction>("sale", //$NON-NLS-1$
406+
return new ExtractorItemMatcher<>("sale", //$NON-NLS-1$
407407
item -> item instanceof BuySellEntryItem buysell //
408408
&& buysell.getSubject() instanceof BuySellEntry entry
409409
&& entry.getPortfolioTransaction().getType() == PortfolioTransaction.Type.SELL
@@ -415,7 +415,7 @@ public static Matcher<Extractor.Item> sale(Matcher<Transaction>... properties)
415415
@SafeVarargs
416416
public static Matcher<Extractor.Item> inboundDelivery(Matcher<Transaction>... properties)
417417
{
418-
return new ExtractorItemMatcher<Transaction>("inbound delivery", //$NON-NLS-1$
418+
return new ExtractorItemMatcher<>("inbound delivery", //$NON-NLS-1$
419419
item -> item instanceof TransactionItem tItem //
420420
&& tItem.getSubject() instanceof PortfolioTransaction tx
421421
&& tx.getType() == PortfolioTransaction.Type.DELIVERY_INBOUND ? tx : null, //
@@ -425,7 +425,7 @@ public static Matcher<Extractor.Item> inboundDelivery(Matcher<Transaction>... pr
425425
@SafeVarargs
426426
public static Matcher<Extractor.Item> outboundDelivery(Matcher<Transaction>... properties)
427427
{
428-
return new ExtractorItemMatcher<Transaction>("outbound delivery", //$NON-NLS-1$
428+
return new ExtractorItemMatcher<>("outbound delivery", //$NON-NLS-1$
429429
item -> item instanceof TransactionItem tItem //
430430
&& tItem.getSubject() instanceof PortfolioTransaction tx
431431
&& tx.getType() == PortfolioTransaction.Type.DELIVERY_OUTBOUND ? tx : null, //
@@ -435,7 +435,7 @@ public static Matcher<Extractor.Item> outboundDelivery(Matcher<Transaction>... p
435435
@SafeVarargs
436436
public static Matcher<Extractor.Item> securityTransfer(Matcher<Transaction>... properties)
437437
{
438-
return new ExtractorItemMatcher<Transaction>("securityTransfer", //$NON-NLS-1$
438+
return new ExtractorItemMatcher<>("securityTransfer", //$NON-NLS-1$
439439
item -> item instanceof PortfolioTransferItem transfer //
440440
&& transfer.getSubject() instanceof PortfolioTransferEntry entry
441441
? entry.getSourceTransaction()
@@ -451,7 +451,7 @@ public static Matcher<Transaction> hasSecurity(Matcher<Security>... properties)
451451

452452
public static Matcher<Transaction> hasDate(String dateString)
453453
{
454-
LocalDateTime expectecd = dateString.contains("T") //$NON-NLS-1$
454+
var expectecd = dateString.contains("T") //$NON-NLS-1$
455455
? LocalDateTime.parse(dateString)
456456
: LocalDate.parse(dateString).atStartOfDay();
457457

@@ -462,9 +462,10 @@ public static Matcher<Transaction> hasDate(String dateString)
462462

463463
public static Matcher<Transaction> hasExDate(String dateString)
464464
{
465-
var expected = dateString.contains("T") //$NON-NLS-1$
466-
? LocalDateTime.parse(dateString)
467-
: LocalDate.parse(dateString).atStartOfDay();
465+
var expected = dateString == null ? null
466+
: dateString.contains("T") //$NON-NLS-1$
467+
? LocalDateTime.parse(dateString)
468+
: LocalDate.parse(dateString).atStartOfDay();
468469

469470
return new PropertyMatcher<>("exDate", //$NON-NLS-1$
470471
expected, //
@@ -474,7 +475,7 @@ public static Matcher<Transaction> hasExDate(String dateString)
474475
public static Matcher<Transaction> hasShares(double value)
475476
{
476477
// work with BigDecimal to have better assertion failed messages
477-
return new PropertyMatcher<Transaction, BigDecimal>("shares", //$NON-NLS-1$
478+
return new PropertyMatcher<>("shares", //$NON-NLS-1$
478479
BigDecimal.valueOf(value).setScale(Values.Share.precision(), Values.MC.getRoundingMode()), //
479480
tx -> BigDecimal.valueOf(tx.getShares()).divide(Values.Share.getBigDecimalFactor(), Values.MC)
480481
.setScale(Values.Share.precision(), Values.MC.getRoundingMode()));
@@ -511,7 +512,7 @@ public static Matcher<Transaction> hasForexGrossValue(String currencyCode, doubl
511512
return new PropertyMatcher<>("forexGrossValue", //$NON-NLS-1$
512513
Money.of(currencyCode, Values.Amount.factorize(value)), //
513514
tx -> {
514-
Unit grossValueUnit = tx.getUnit(Unit.Type.GROSS_VALUE).orElseThrow(AssertionError::new);
515+
var grossValueUnit = tx.getUnit(Unit.Type.GROSS_VALUE).orElseThrow(AssertionError::new);
515516
return grossValueUnit.getForex();
516517
});
517518
}
@@ -547,7 +548,7 @@ public static Matcher<Transaction> check(Consumer<Transaction> check)
547548
@SafeVarargs
548549
public static Matcher<Extractor.Item> security(Matcher<Security>... properties)
549550
{
550-
return new ExtractorItemMatcher<Security>("security", //$NON-NLS-1$
551+
return new ExtractorItemMatcher<>("security", //$NON-NLS-1$
551552
item -> item instanceof SecurityItem securityItem ? securityItem.getSecurity() : null,
552553
properties);
553554
}
@@ -590,7 +591,7 @@ public static Matcher<Security> hasFeed(String quoteFeed)
590591

591592
public static Matcher<Security> hasFeedProperty(String name, String value)
592593
{
593-
return new PropertyMatcher<Security, String>("feedProperty " + name, value, //$NON-NLS-1$
594+
return new PropertyMatcher<>("feedProperty " + name, value, //$NON-NLS-1$
594595
s -> s.getPropertyValue(SecurityProperty.Type.FEED, name).orElse(null));
595596
}
596597

name.abuchen.portfolio.tests/src/name/abuchen/portfolio/datatransfer/ExtractorMatchersTest.java

Lines changed: 23 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,7 @@ public class ExtractorMatchersTest
4949
@BeforeClass
5050
public static void setup()
5151
{
52-
Security s = new Security();
52+
var s = new Security();
5353
s.setName("test name");
5454
s.setIsin("DE01");
5555
s.setWkn("WKN");
@@ -58,7 +58,7 @@ public static void setup()
5858

5959
someSecurity = new SecurityItem(s);
6060

61-
AccountTransaction tx = new AccountTransaction();
61+
var tx = new AccountTransaction();
6262
tx.setType(AccountTransaction.Type.DIVIDENDS);
6363
tx.setCurrencyCode(CurrencyUnit.EUR);
6464
tx.setAmount(Values.Amount.factorize(100));
@@ -87,7 +87,7 @@ public void testEmptyList()
8787
@Test(expected = AssertionError.class)
8888
public void testList()
8989
{
90-
List<Extractor.Item> items = new ArrayList<Extractor.Item>();
90+
List<Extractor.Item> items = new ArrayList<>();
9191

9292
items.add(new Extractor.TransactionItem(new AccountTransaction()));
9393

@@ -97,11 +97,11 @@ public void testList()
9797
@Test(expected = AssertionError.class)
9898
public void testDividend()
9999
{
100-
List<Extractor.Item> items = new ArrayList<Extractor.Item>();
100+
List<Extractor.Item> items = new ArrayList<>();
101101

102102
items.add(someTransactionItem);
103103

104-
AccountTransaction tx = new AccountTransaction();
104+
var tx = new AccountTransaction();
105105
tx.setType(AccountTransaction.Type.DIVIDENDS);
106106
tx.setCurrencyCode(CurrencyUnit.EUR);
107107
tx.setAmount(1000);
@@ -278,6 +278,24 @@ public void testExDateSuccess()
278278
assertThat(List.of(someTransactionItem), hasItem(dividend(hasExDate("2023-04-28"))));
279279
}
280280

281+
@Test(expected = AssertionError.class)
282+
public void testExDateNullFailure()
283+
{
284+
assertThat(List.of(someTransactionItem), hasItem(dividend(hasExDate(null))));
285+
}
286+
287+
@Test
288+
public void testExDateNullSuccess()
289+
{
290+
var tx = new AccountTransaction();
291+
tx.setType(AccountTransaction.Type.DIVIDENDS);
292+
tx.setCurrencyCode(CurrencyUnit.EUR);
293+
tx.setAmount(Values.Amount.factorize(100));
294+
tx.setDateTime(LocalDateTime.parse("2023-04-30T12:45"));
295+
296+
assertThat(List.of(new TransactionItem(tx)), hasItem(dividend(hasExDate(null))));
297+
}
298+
281299
@Test(expected = AssertionError.class)
282300
public void testSkippedItemNotSkipped()
283301
{

0 commit comments

Comments
 (0)