Commit 64b29f6

Add docs for yaml dependencies. (apache#34345)
1 parent 5fe4bbc commit 64b29f6

3 files changed: +34 -0 lines changed

CHANGES.md

+2
@@ -111,6 +111,8 @@
 * [Java] Support for `--add-modules` JVM option is added through a new pipeline option `JdkAddRootModules`. This allows extending the module graph with optional modules such as SDK incubator modules. Sample usage: `<pipeline invocation> --jdkAddRootModules=jdk.incubator.vector` ([#30281](https://github.com/apache/beam/issues/30281)).
 * X feature added (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).
 * Managed API for [Java](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/managed/Managed.html) and [Python](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.managed.html#module-apache_beam.transforms.managed) supports [key I/O connectors](https://beam.apache.org/documentation/io/connectors/) Iceberg, Kafka, and BigQuery.
+* [YAML] Beam YAML UDFs (such as those used in MapToFields) can now have declared dependencies
+  (e.g. pypi packages for Python, or extra jars for Java).
 * Prism now supports event time triggers for most common cases. ([#31438](https://github.com/apache/beam/issues/31438))
   * Prism does not yet support triggered side inputs, or triggers on merging windows (such as session windows).

sdks/python/apache_beam/yaml/readme_test.py

+2
@@ -299,6 +299,8 @@ def extract_name(input_spec):
       if code_lines:
         if code_lines[0].startswith('- type:'):
           specs = yaml.load('\n'.join(code_lines), Loader=SafeLoader)
+          if 'dependencies:' in specs:
+            test_type = 'PARSE'
           is_chain = not any('input' in spec for spec in specs)
           if is_chain:
             undefined_inputs = set(['input'])
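
Note on the added check: assuming `specs` is the parsed list of transform mappings that `yaml.load` returns for a snippet starting with `- type:`, the expression `'dependencies:' in specs` compares the string against whole list elements, so it would not match a `dependencies` key nested in a transform's `config`. A minimal sketch of a key-based check follows; `declares_dependencies` is a hypothetical helper for illustration, not code from this commit, and the sample data is a hand-written stand-in for parsed YAML.

```python
def declares_dependencies(specs):
    """Return True if any transform spec has a `dependencies` key in its config.

    Hypothetical helper (not part of the commit); assumes `specs` is a list
    of dicts shaped like parsed Beam YAML transforms.
    """
    for spec in specs:
        if not isinstance(spec, dict):
            continue
        config = spec.get('config')
        if isinstance(config, dict) and 'dependencies' in config:
            return True
    return False


# Hand-written stand-ins for parsed transform lists (fields left empty for brevity).
with_deps = [{
    'type': 'MapToFields',
    'config': {'language': 'python', 'dependencies': ['scipy>=1.15'], 'fields': {}},
}]
without_deps = [{'type': 'MapToFields', 'config': {'language': 'python', 'fields': {}}}]

print(declares_dependencies(with_deps))
print(declares_dependencies(without_deps))
# The string-in-list test from the diff, applied to the same parsed data.
print('dependencies:' in with_deps)
```

An alternative that stays closer to the committed line would be to test the raw snippet text (for example, the entries of `code_lines`) for `dependencies:` before parsing. Which behavior is intended is not stated in the commit.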

website/www/site/content/en/documentation/sdks/yaml-udf.md

+30
@@ -508,3 +508,33 @@ a `{type: 'basic_type_name'}` nesting.
 This can be especially useful to resolve errors involving the inability to
 handle the `beam:logical:pythonsdk_any:v1` type.
+
+
+## Dependencies
+
+Often user defined functions need to rely on external dependencies.
+These can be provided with a `dependencies` attribute in the transform
+config. For example
+
+```
+- type: MapToFields
+  config:
+    language: python
+    dependencies:
+      - 'scipy>=1.15'
+    fields:
+      new_col:
+        callable: |
+          import scipy.special
+
+          def func(t):
+            return scipy.special.zeta(complex(1/2, t))
+```
+
+The dependencies are interpreted according to the language, e.g.
+for Java one provides a list of maven identifiers and/or jars,
+and for Python one provides a list of pypi package specifiers and/or sdist tarballs.
+See also the full examples using
+[java dependencies](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/yaml/examples/transforms/elementwise/map_to_fields_with_java_deps.yaml)
+and
+[python dependencies](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/yaml/examples/transforms/elementwise/map_to_fields_with_deps.yaml).
