Commit 6cba0fe

Author: Jose J. Martinez
Merge remote-tracking branch 'origin/master'
2 parents 294c549 + b103fa6 commit 6cba0fe

1,857 files changed, +198661 -84518 lines changed

.gitattributes

+1
@@ -1,2 +1,3 @@
 python/example/* linguist-vendored
 *.ipynb linguist-vendored
+examples/* linguist-vendored

.github/ISSUE_TEMPLATE/bug_report.md

-47
This file was deleted.

.github/ISSUE_TEMPLATE/bug_report.yml

+105
@@ -0,0 +1,105 @@
+name: Bug report
+description: File a bug/issue to help us improve Spark NLP. Thank you for contributing!
+labels: [bug]
+assignees: "maziyarpanahi"
+
+body:
+  - type: checkboxes
+    attributes:
+      label: Is there an existing issue for this?
+      description: Please search to see if an issue already exists for the bug you encountered.
+      options:
+        - label: I have searched the existing issues and did not find a match.
+          required: true
+  - type: textarea
+    attributes:
+      label: Who can help?
+      description: |
+        Your issue will be processed faster, if you can tag the right person for it.
+        If you know how to use `git blame`, then you can also tag the person directly.
+        Otherwise we will get the right person to help you.
+  - type: textarea
+    attributes:
+      label: What are you working on?
+      description: |
+        A brief description on the context of the issue. Is it an official example?
+        Is it a published or custom task/dataset (GLUE/SQuAD, etc.)?
+    validations:
+      required: true
+  - type: textarea
+    attributes:
+      label: Current Behavior
+      description: A concise description of what you're experiencing.
+    validations:
+      required: true
+  - type: textarea
+    attributes:
+      label: Expected Behavior
+      description: A concise description of what you expected to happen.
+    validations:
+      required: true
+  - type: textarea
+    attributes:
+      label: Steps To Reproduce
+      description: |
+        Please provide information on how to reproduce the issue. This could be a link to
+        Google Colab or Databricks or any other notebook. Alternatively, it can be a
+        pipeline (that is formatted in Markdown).
+        If you have any error logs and stack traces, attach them here as well.
+      placeholder: |
+        A link to an end-to-end Colab/Jupyter notebook such as https://colab.research.google.com/...
+        or a full pipeline code snippet:
+
+        ```python
+        import sparknlp
+        ...
+        ```
+    validations:
+      required: true
+  - type: markdown
+    attributes:
+      value: |
+        ## Environment
+        Please provide us with information about your environment. If you can provide more information for us, we can resolve the issue faster.
+  - type: textarea
+    attributes:
+      label: Spark NLP version and Apache Spark
+      description: Result of `sparknlp.version() and spark.version`
+      placeholder: |
+        import sparknlp
+        sparknlp.version()
+        spark.version
+    validations:
+      required: true
+  - type: dropdown
+    attributes:
+      label: Type of Spark Application
+      multiple: true
+      options: ["spark-shell", "spark-submit", "Scala Application", "Python Application", "Java Application"]
+  - type: input
+    attributes:
+      label: Java Version
+      description: Result of `java -version`
+  - type: input
+    attributes:
+      label: Java Home Directory
+      description: Result of `echo $JAVA_HOME` or `JAVA_HOME` environment variable for windows
+  - type: input
+    attributes:
+      label: Setup and installation
+      description: How you set up Spark NLP, e.g. Pypi, Conda, Maven, sbt, etc.
+  - type: input
+    attributes:
+      label: Operating System and Version
+  - type: input
+    attributes:
+      label: Link to your project (if available)
+  - type: textarea
+    attributes:
+      label: Additional Information
+      description: |
+        Links? References? Anything that will give us more context about the issue you are encountering.
+
+        Tip: You can attach files by clicking this area to highlight it and then dragging them in.
+    validations:
+      required: false
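
The environment fields in the form above (Java Version, Java Home Directory, Operating System and Version) can be gathered in one step. The following is a minimal sketch using only the Python standard library; it is not part of this commit. The helper name `collect_environment` is hypothetical, and the Spark NLP and Apache Spark versions are deliberately left out because they have to come from `sparknlp.version()` and `spark.version` in a live session.

```python
import os
import platform
import shutil
import subprocess


def collect_environment() -> dict:
    """Gather the non-Spark environment fields the bug report form asks for.

    Hypothetical helper for illustration only; not shipped with Spark NLP.
    """
    java_version = "java not found on PATH"
    java = shutil.which("java")
    if java is not None:
        try:
            # `java -version` writes its banner to stderr, not stdout.
            result = subprocess.run(
                [java, "-version"], capture_output=True, text=True, check=False
            )
            java_version = (result.stderr or result.stdout).strip()
        except OSError as exc:
            java_version = f"could not run java: {exc}"
    return {
        "Java Version": java_version,
        # Fall back to a readable note when the variable is unset or empty.
        "Java Home Directory": os.environ.get("JAVA_HOME") or "JAVA_HOME is not set",
        "Operating System and Version": platform.platform(),
    }


if __name__ == "__main__":
    for label, value in collect_environment().items():
        print(f"{label}: {value}")
```

The dictionary keys mirror the form labels so the output can be pasted into the matching fields.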

.github/ISSUE_TEMPLATE/config.yml

+9
@@ -0,0 +1,9 @@
+blank_issues_enabled: true
+
+contact_links:
+  - name: Converting models from other libraries? Visit the Discussion to see which are compatible.
+    url: https://github.com/JohnSnowLabs/spark-nlp/discussions/5669
+    about: Discussion about importing models from other libraries
+  - name: Want to contribute a model? Visit the NLP Models Hub to upload your model.
+    url: https://nlp.johnsnowlabs.com/models
+    about: A place for sharing and discovering Spark NLP models and pipelines

.github/ISSUE_TEMPLATE/doc_improvement.md

-14
This file was deleted.
+14
@@ -0,0 +1,14 @@
+name: Documentation Improvement
+description: File an issue to suggest edits to the documentation.
+labels: [documentation]
+assignees: "DevinTDHa"
+
+body:
+  - type: textarea
+    attributes:
+      label: Link to the documentation pages (if available)
+  - type: textarea
+    attributes:
+      label: How could the documentation be improved?
+    validations:
+      required: true

.github/ISSUE_TEMPLATE/feature_request.md

-20
This file was deleted.
+34
@@ -0,0 +1,34 @@
+name: "Feature Request"
+description: Suggest a new Spark NLP feature
+labels: [ "Feature request" ]
+assignees: "maziyarpanahi"
+
+body:
+  - type: textarea
+    id: description
+    validations:
+      required: true
+    attributes:
+      label: Description
+      description: |
+        Is your feature request related to a problem? Why do you want this feature?
+        Please provide a clear and concise description of what the problem is. Also link GitHub issues when applicable.
+      placeholder: |
+        I have this problem I want to solve in Spark NLP, but ...
+        Implementing this new feature would help ...
+  - type: textarea
+    id: solution
+    validations:
+      required: true
+    attributes:
+      label: Preferred Solution
+      description: |
+        A clear and concise description of what you want to happen.
+  - type: textarea
+    id: additional-context
+    validations:
+      required: true
+    attributes:
+      label: Additional Context
+      description: |
+        Add any other context or screenshots about the feature request here.

.github/workflows/build_and_test.yml

+1-1
@@ -54,7 +54,7 @@ jobs:
       - name: Install Python packages (Python 3.7)
         run: |
           python -m pip install --upgrade pip
-          pip install pyspark==3.3.0 numpy pytest
+          pip install pyspark==3.3.1 numpy pytest
       - name: Build Spark NLP on Apache Spark 3.3.0
         run: |
           brew install sbt

CHANGELOG

+27
@@ -1,3 +1,30 @@
+========
+4.3.0
+========
+----------------
+New Features
+----------------
+* Implement HubertForCTC annotator for automatic speech recognition
+* Implement SwinForImageClassification annotator for Image Classification
+* Introducing CamemBERT for Question Answering annotator
+* Implement ZeroShotNerModel annotator for zero-shot NER based on RoBERTa architecture
+* Implement Date2Chunk annotator
+* Enable params argument in spark_nlp start() function
+* Allow doc_id reading CoNLL file datasets
+
+----------------
+Bug Fixes & Enhancements
+----------------
+* Relocating all notebooks back to examples directory
+* Improve download/loading models & pipelines from AWS and GCP. When setting `cache_pretrained` directory to AWS and GCP will avoid copying existing models/pipelines
+* Improve GitHub templates for Bug reports, documentation, and feature request
+* Add documentation to ResourceDownloader
+* Refactor `ml` package to allow another DL engine in future
+* Apache Spark 3.3.1 is now the base version of Spark NLP
+* Spark NLP supports M2 in addition to M1. Therefore, we are renaming `spark-nlp-m1` to `spark-nlp-silicon` on Maven
+* Fix calculating delimiter id in CamemBERT
+* Fix loadSavedModel for private buckets
+
 ========
 4.2.8
 ========
