Commit 548d1a0

committed
5.0.0
1 parent 1e56853 commit 548d1a0

2 files changed

Lines changed: 101 additions & 42 deletions

README.md

Lines changed: 94 additions & 35 deletions
Original file line number | Diff line number | Diff line change
@@ -16,7 +16,7 @@ NekoHTML adds missing parent elements; automatically closes elements with option
1616
**Standards Compliant** - Follows HTML parsing specifications
1717
**Well Tested** - Over 8,000 test cases
1818
**No External Dependencies** - Pure Java implementation
19-
**Java 8+ Compatible** - Works with Java 8, 11, 17, 21 and beyond
19+
**Java 17+ Compatible** - Works with Java 17, 21, and beyond
2020
**Android Support** - Runs on Android platforms
2121

2222
The **Htmlunit-NekoHtml** Parser is used by [HtmlUnit](https://htmlunit.sourceforge.io/).
@@ -31,25 +31,11 @@ The **Htmlunit-NekoHtml** Parser is used by [HtmlUnit](https://htmlunit.sourcefo
3131

3232
#### Version 5
3333

34-
Work on HtmlUnit-NekoHTML 5.0 has started. This new major version will require **JDK 17 or higher**.
34+
Starting with version 5.0.0, **JDK 17 or higher is required**.
35+
If you are still on JDK 8, see [Legacy Support (JDK 8)](#legacy-support-jdk-8) below.
3536

3637

37-
#### Legacy Support (JDK 8)
38-
39-
If you need to continue using **JDK 8**, please note that versions 4.x will remain available as-is. However,
40-
**ongoing maintenance and fixes for JDK 8 compatibility are only available through sponsorship**.
41-
42-
Maintaining separate fix versions for JDK 8 requires significant additional effort for __backporting__, testing, and release management.
43-
44-
**To enable continued JDK 8 support**, please contact me via email to discuss sponsorship options. Sponsorship provides:
45-
46-
- __Backporting__ security and bug fixes to the 4.x branch
47-
- Maintaining compatibility with older Java versions
48-
- Timely releases for critical issues
49-
50-
Without sponsorship, the 4.x branch will not receive updates. Your support ensures the long-term __sustainability__ of this project across multiple Java versions.
51-
52-
### Latest release Version 4.21.0 / December 28, 2025
38+
### Latest release Version 5.0.0 / May 24, 2026
5339

5440
##### Security Advisories
5541

@@ -69,7 +55,7 @@ Add to your `pom.xml`:
6955
<dependency>
7056
<groupId>org.htmlunit</groupId>
7157
<artifactId>neko-htmlunit</artifactId>
72-
<version>4.21.0</version>
58+
<version>5.0.0</version>
7359
</dependency>
7460
```
7561

@@ -78,7 +64,7 @@ Add to your `pom.xml`:
7864
Add to your `build.gradle`:
7965

8066
```groovy
81-
implementation group: 'org.htmlunit', name: 'neko-htmlunit', version: '4.21.0'
67+
implementation group: 'org.htmlunit', name: 'neko-htmlunit', version: '5.0.0'
8268
```
8369

8470
## HowTo use
@@ -301,6 +287,21 @@ This includes both tags present in the source HTML and tags automatically insert
301287
- **Error Reporter**: Implementing a custom error reporter allows you to handle parsing errors according to your application's needs.
302288
- **Name Casing**: Changing element/attribute name casing affects the output and may impact CSS selectors or JavaScript that relies on specific casing.
303289

290+
291+
<a name="legacy-support-jdk-8"></a>
292+
### Legacy Support (JDK 8)
293+
294+
If you need to continue using **JDK 8**, versions 4.x remain available as-is.
295+
Ongoing maintenance and fixes for JDK 8 are only available through sponsorship —
296+
please contact me via email to discuss options. Sponsorship provides:
297+
298+
- Backporting security and bug fixes to the 4.x branch
299+
- Compatibility maintenance with older Java versions
300+
- Timely releases for critical issues
301+
302+
Without sponsorship, the 4.x branch will not receive further updates.
303+
304+
304305
### Last CI build
305306
The latest builds are available from our
306307
[Jenkins CI build server](https://jenkins.wetator.org/job/HtmlUnit%20-%20Neko/ "HtmlUnit -Neko CI")
@@ -317,7 +318,7 @@ Add the dependency to your `pom.xml`:
317318
<dependency>
318319
<groupId>org.htmlunit</groupId>
319320
<artifactId>neko-htmlunit</artifactId>
320-
<version>4.22.0-SNAPSHOT</version>
321+
<version>5.1.0-SNAPSHOT</version>
321322
</dependency>
322323

323324
You also have to add the sonatype-central snapshot repository to the `repositories` section of your pom:
@@ -347,13 +348,71 @@ repositories {
347348
}
348349
// ...
349350
dependencies {
350-
implementation group: 'org.htmlunit', name: 'neko-htmlunit', version: '4.22.0-SNAPSHOT'
351+
implementation group: 'org.htmlunit', name: 'neko-htmlunit', version: '5.1.0-SNAPSHOT'
351352
// ...
352353
}
353354
```
354355

355356

356-
## Porting from 3.x to 4.x
357+
## Migrating from 4.x to 5.x
358+
359+
Version 5.0.0 is a major release. The changes below cover everything you need to update when upgrading from any 4.x release.
360+
361+
### Java version requirement
362+
363+
5.x requires **JDK 17 or higher**. Java 8 and Java 11 are no longer supported.
364+
If you cannot upgrade your JDK, stay on the [4.x branch](#legacy-support-jdk-8).
365+
366+
### HTMLElements thread-safety fix (since 4.17.0)
367+
368+
The shared `HTMLElements` instance no longer caches unknown elements because that cache was not thread-safe. If your code relied on the unknown-element cache, switch to `HTMLElementsWithCache` and create a new instance per parse run:
369+
370+
```java
371+
// 4.x — relied on shared cache in HTMLElements (not thread-safe)
372+
DOMParser legacyParser = new DOMParser(HTMLDocumentImpl.class);
373+
374+
// 5.x — use a fresh HTMLElementsWithCache per parse run if you need the cache
375+
HTMLElementsProvider provider = new HTMLElementsWithCache();
376+
DOMParser parser = new DOMParser(HTMLDocumentImpl.class, provider);
377+
```
378+
379+
### HTMLScanner document handler is now required (since 4.12.0)
380+
381+
`HTMLScanner` now enforces that a document handler is set before parsing. The null check was moved to the setter, so passing `null` will throw immediately rather than failing silently mid-parse. Ensure you always call `setContentHandler` (SAXParser) or provide a handler via the `DOMParser` constructor before calling `parse()`.
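
The fail-fast behavior described above can be illustrated with a minimal, self-contained sketch. `SketchScanner` and `Handler` are hypothetical stand-ins, not classes from this library, and the exception type that the real `HTMLScanner` setter throws is an assumption here.

```java
import java.util.Objects;

class FailFastSetterDemo {
    // Hypothetical handler type; NOT part of htmlunit-neko.
    interface Handler {
        void startDocument();
    }

    // Hypothetical scanner; NOT the real HTMLScanner.
    static final class SketchScanner {
        private Handler handler;

        // The null check lives in the setter, so a bad value is rejected at configuration time.
        void setHandler(final Handler handler) {
            this.handler = Objects.requireNonNull(handler, "handler must not be null");
        }

        // Parsing still refuses to start if the setter was never called.
        void parse() {
            if (handler == null) {
                throw new IllegalStateException("no handler set");
            }
            handler.startDocument();
        }
    }

    // Returns true when a null handler is rejected by the setter itself.
    static boolean rejectsNull() {
        try {
            new SketchScanner().setHandler(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNull());
    }
}
```

The point of the pattern is that a misconfiguration surfaces at the call that caused it, not later inside the parse loop.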
382+
383+
### Tag name casing for auto-inserted tags (since 4.20.0 / 4.21.0)
384+
385+
Auto-inserted tags (e.g. `<html>`, `<head>`, `<body>`) are now consistently produced in **lowercase** by default. If your code compared tag names assuming uppercase auto-inserted tags, update those comparisons or set the `NAMES_ELEMS` property explicitly:
386+
387+
```java
388+
// Force uppercase if your code depends on it
389+
parser.setProperty(HTMLScanner.NAMES_ELEMS, "upper");
390+
391+
// Or update comparisons to be case-insensitive
392+
String name = element.getTagName().toLowerCase(Locale.ROOT);
393+
```
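
As a small illustration of why the comparison above uses `Locale.ROOT`, the following sketch uses only the JDK. The class and method names (`TagNameCompareDemo`, `isTag`) are hypothetical helpers and are not part of the library.

```java
import java.util.Locale;

class TagNameCompareDemo {
    // Locale-independent check of a tag name against an expected lowercase name.
    static boolean isTag(final String tagName, final String expectedLowercase) {
        return tagName.toLowerCase(Locale.ROOT).equals(expectedLowercase);
    }

    // Naive variant that lowercases with an explicit, possibly surprising locale.
    static boolean isTagWithLocale(final String tagName, final String expectedLowercase,
            final Locale locale) {
        return tagName.toLowerCase(locale).equals(expectedLowercase);
    }

    public static void main(String[] args) {
        // Uppercase and lowercase spellings both match with Locale.ROOT.
        System.out.println(isTag("TITLE", "title"));
        System.out.println(isTag("title", "title"));

        // Under a Turkish locale, 'I' lowercases to dotless i (U+0131),
        // so the same comparison no longer matches.
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(isTagWithLocale("TITLE", "title", turkish));
    }
}
```

`String.equalsIgnoreCase` is another locale-independent option when only an equality check is needed.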
394+
395+
### EOF handling changed from exceptions to return codes (since 4.19.0)
396+
397+
EOF conditions inside the scanner are now signalled via return codes rather than exceptions. This is an internal change that should not affect typical parser usage, but if you have custom `HTMLScanner` subclasses that override scanning methods and catch internal exceptions for EOF detection, review those overrides.
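
To make the difference in style concrete, here is a generic sketch of return-code based EOF signalling built on `java.io.Reader`, whose `read()` method returns `-1` at end of input. `EofReturnCodeDemo`, `scanName`, and the `EOF` constant are hypothetical and do not mirror the library's internal scanner methods.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class EofReturnCodeDemo {
    // Hypothetical return code for this sketch; not a constant from the library.
    static final int EOF = -1;

    // Appends leading letters to 'out' and returns the character that ended
    // the name, or EOF when the input ran out first.
    static int scanName(final Reader in, final StringBuilder out) {
        try {
            int c;
            while ((c = in.read()) != EOF) {
                if (!Character.isLetter(c)) {
                    return c;
                }
                out.append((char) c);
            }
            // End of input is reported as an ordinary result, not as an exception.
            return EOF;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder name = new StringBuilder();
        int next = scanName(new StringReader("body"), name);
        System.out.println(name + " eof=" + (next == EOF));
    }
}
```

A subclass written against the older exception-based style would need its `catch` blocks replaced by checks on a result like this one.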
398+
399+
### SAXParser factory added (since 4.15.0)
400+
401+
`NekoSAXParserFactory` was added as a standard `javax.xml.parsers.SAXParserFactory` entry point. If you were constructing `SAXParser` directly and want to align with the standard JAXP pattern going forward:
402+
403+
```java
404+
// New in 4.15 / available in 5.x
405+
SAXParserFactory factory = new NekoSAXParserFactory();
406+
javax.xml.parsers.SAXParser saxParser = factory.newSAXParser();
407+
```
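
The benefit of the JAXP pattern is that calling code depends only on `javax.xml.parsers.SAXParserFactory`. The sketch below shows such code; it runs against the JDK's default factory so that it needs no extra jars. Passing `new NekoSAXParserFactory()` instead is assumed to work the same way but is not verified here, and element name casing may differ. `JaxpFactoryDemo` and `elementNames` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class JaxpFactoryDemo {
    // Collects element names in document order using whatever factory is supplied.
    static List<String> elementNames(final SAXParserFactory factory, final String markup) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new InputSource(new StringReader(markup)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                        Attributes atts) {
                    names.add(qName);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // JDK default factory, used only so the sketch is runnable on its own.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println(elementNames(factory, "<html><body><p>hi</p></body></html>"));
    }
}
```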
408+
409+
### Removed and unsupported features
410+
411+
- **`XMLLocator`** was simplified in 4.5.0 and some of its internal fields were removed. Custom subclasses that accessed internal locator fields will need to be updated.
412+
- **`XMLAttributesImpl`** was simplified in 4.5.0; direct field access patterns in custom subclasses should be replaced with the public API.
413+
414+
415+
## Migrating from 3.x to 4.x
357416

358417
Version 4.x introduces a major change in the handling of encodings - the mapping from the encoding
359418
label found in the meta tag to the encoding to be used for parsing the document got some significant
@@ -369,7 +428,7 @@ For this also
369428
encoding translator if you want to keep the old translation behavior (`parser.setProperty(HTMLScanner.ENCODING_TRANSLATOR, EncodingMap.INSTANCE)`)
370429

371430

372-
## Porting from 2.x to 3.x
431+
## Migrating from 2.x to 3.x
373432

374433
Usually the upgrade should be simple:
375434

@@ -443,18 +502,18 @@ This part is intended for committer who are packaging a release.
443502
* Create the version on Github
444503
* login to Github and open project https://github.com/HtmlUnit/htmlunit-neko
445504
* click Releases > Draft new release
446-
* fill the tag and title field with the release number (e.g. 4.0.0)
505+
* fill the tag and title fields with the release number (e.g. 5.0.0)
447506
* append
448-
* neko-htmlunit-4.xx.jar
449-
* neko-htmlunit-4.xx.jar.asc
450-
* neko-htmlunit-4.xx.pom
451-
* neko-htmlunit-4.xx.pom.asc
452-
* neko-htmlunit-4.xx-javadoc.jar
453-
* neko-htmlunit-4.xx-javadoc.jar.asc
454-
* neko-htmlunit-4.xx-sources.jar
455-
* neko-htmlunit-4.xx-sources.jar.asc
456-
* neko-htmlunit-4.xx-tests.jar
457-
* neko-htmlunit-4.xx-tests.jar.asc
507+
* neko-htmlunit-5.xx.jar
508+
* neko-htmlunit-5.xx.jar.asc
509+
* neko-htmlunit-5.xx.pom
510+
* neko-htmlunit-5.xx.pom.asc
511+
* neko-htmlunit-5.xx-javadoc.jar
512+
* neko-htmlunit-5.xx-javadoc.jar.asc
513+
* neko-htmlunit-5.xx-sources.jar
514+
* neko-htmlunit-5.xx-sources.jar.asc
515+
* neko-htmlunit-5.xx-tests.jar
516+
* neko-htmlunit-5.xx-tests.jar.asc
458517
* and publish the release
459518

460519
* Update the version number in pom.xml to start next snapshot development

pom.xml

Lines changed: 7 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -5,7 +5,7 @@
55
<modelVersion>4.0.0</modelVersion>
66
<groupId>org.htmlunit</groupId>
77
<artifactId>neko-htmlunit</artifactId>
8-
<version>5.0.0-SNAPSHOT</version>
8+
<version>5.0.0</version>
99
<name>HtmlUnit NekoHtml</name>
1010
<organization>
1111
<name>HtmlUnit</name>
@@ -27,27 +27,27 @@
2727
<maven.version.ignore>(?i).*-(alpha|beta|m|rc)([\.-]?\d+)?</maven.version.ignore>
2828

2929
<!-- test dependencies -->
30-
<junit.version>6.0.3</junit.version>
30+
<junit.version>6.1.0</junit.version>
3131

3232
<!-- quality -->
3333
<checkstyle.version>12.3.1</checkstyle.version>
3434
<spotbugs.version>4.9.8</spotbugs.version>
35-
<pmd.version>7.20.0</pmd.version>
35+
<pmd.version>7.24.0</pmd.version>
3636
<dependencycheck.version>10.0.4</dependencycheck.version>
3737

3838
<!-- plugins -->
3939
<central-publishing-plugin.version>0.10.0</central-publishing-plugin.version>
4040
<build-helper-plugin.version>3.6.1</build-helper-plugin.version>
4141
<checkstyle-plugin.version>3.6.0</checkstyle-plugin.version>
4242
<pmd-plugin.version>3.28.0</pmd-plugin.version>
43-
<spotbugs-plugin.version>4.9.8.2</spotbugs-plugin.version>
43+
<spotbugs-plugin.version>4.9.8.3</spotbugs-plugin.version>
4444
<gpg-plugin.version>3.2.8</gpg-plugin.version>
45-
<enforcer-plugin.version>3.6.2</enforcer-plugin.version>
46-
<compiler-plugin.version>3.14.1</compiler-plugin.version>
45+
<enforcer-plugin.version>3.6.3</enforcer-plugin.version>
46+
<compiler-plugin.version>3.15.0</compiler-plugin.version>
4747
<jar-plugin.version>3.5.0</jar-plugin.version>
4848
<source-plugin.version>3.4.0</source-plugin.version>
4949
<javadoc-plugin.version>3.12.0</javadoc-plugin.version>
50-
<surefire-plugin.version>3.5.4</surefire-plugin.version>
50+
<surefire-plugin.version>3.5.5</surefire-plugin.version>
5151
</properties>
5252

5353
<dependencies>
