
Commit 2a0f845

Add functionality for retrieving callable definitions and converting them to callables (unitycatalog#912)
**PR Checklist**

- [X] A description of the changes is added to the description of this PR.
- [ ] If there is a related issue, make sure it is linked to this PR.
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added or modified a feature, documentation in `docs` is updated

**Description of changes**

Extension of unitycatalog#911, bringing a unified functionality across both clients. Adds a utility for converting the string representation to an actual callable for local execution.

---------

Signed-off-by: Ben Wilson <[email protected]>
1 parent dece8ec commit 2a0f845

14 files changed: +1391 -24 lines changed

ai/core/README.md

+132
@@ -298,6 +298,65 @@ full_func_name = f"{CATALOG}.{SCHEMA}.{func_name}"
298298
client.get_function(full_func_name)
299299
```
300300
301+
#### Retrieving a UC function callable
302+
303+
The `get_function_source` API retrieves a reconstructed Python callable definition (as a string) from a registered Unity Catalog Python function.
304+
To use this API, the function that you are fetching **must be** an `EXTERNAL` (Python function) type function. When called, the API retrieves the function's metadata,
305+
rebuilds the structure of the original callable, and returns it as a string.
306+
307+
For example:
308+
309+
```python
310+
# Define a python callable
311+
312+
def sample_python_func(a: int, b: int) -> int:
313+
    """
314+
    Returns the sum of a and b.
315+
316+
    Args:
317+
        a: an int
318+
        b: another int
319+
320+
    Returns:
321+
        The sum of a and b
322+
    """
323+
    return a + b
324+
325+
# Create the function within Unity Catalog
326+
client.create_python_function(catalog=CATALOG, schema=SCHEMA, func=sample_python_func, replace=True)
327+
328+
# Fetch the callable definition
329+
my_func_def = client.get_function_source(function_name=f"{CATALOG}.{SCHEMA}.sample_python_func")
330+
```
331+
332+
The returned value from the `get_function_source` API will be the same as the original input with a few caveats:
333+
334+
- `tuple` types will be cast to `list` due to the inability to express a Python `tuple` within Unity Catalog
335+
- The docstring of the original function will be stripped out. Unity Catalog persists the docstring information in the logged function and it is available in the return of the `get_function` API call if needed.
336+
- Collection types for open source Unity Catalog will only capture the outer type (i.e., `list` or `dict`) as inner collection type metadata is not preserved
337+
within the `FunctionInfo` object. In Databricks, full typing is supported for collections.
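
As a rough illustration of the collection-type caveats above, the following sketch uses the standard `typing` module to reduce an annotation to its outer type. The helper `outer_type_name` is hypothetical and is not part of the `unitycatalog` package; it only shows what "outer type only" could look like in a reconstructed signature.

```python
from typing import Any, Dict, List, Tuple, get_origin


def outer_type_name(annotation: Any) -> str:
    """Hypothetical helper: keep only the outermost type of an annotation."""
    # get_origin returns None for plain types such as `int`.
    origin = get_origin(annotation) or annotation
    # Mirror the first caveat: tuples are reported as lists.
    if origin is tuple:
        origin = list
    return origin.__name__


print(outer_type_name(List[int]))             # list
print(outer_type_name(Dict[str, List[int]]))  # dict
print(outer_type_name(Tuple[int, str]))       # list
print(outer_type_name(int))                   # int
```

Under these assumptions, a parameter annotated as `Dict[str, List[int]]` would come back annotated simply as `dict`.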
338+
339+
The result of calling the `get_function_source` API on the `sample_python_func` registered function will be (when printed):
340+
341+
```text
342+
def sample_python_func(a: int, b: int) -> int:
343+
    """
344+
    Returns the sum of a and b.
345+
346+
    Args:
347+
        a: an int
348+
        b: another int
349+
350+
    Returns:
351+
        int
352+
    """
353+
    return a + b
354+
```
355+
356+
Note: If you want to convert the extracted string back into an actual Python callable, you can use the utility `load_function_from_string` in the module `unitycatalog.ai.core.utils.execution_utils`. See below for further details on this API.
357+
358+
This API is useful for extracting already-registered functions to use as additional in-line calls within another function through the `create_wrapped_python_function` API. It saves the effort of either hand-crafting a function definition or tracking down where the original implementation of a logged function was defined.
359+
301360
#### List UC functions
302361

303362
To list the functions stored in a catalog and schema, use the list API with wildcards.
@@ -315,6 +374,79 @@ result = client.execute_function(full_func_name, {"s": "some_string"})
315374
assert result.value == "some_string"
316375
```
317376

377+
#### Execute a UC Python function locally
378+
379+
A utility, `load_function_from_string`, is available in the module `unitycatalog.ai.core.utils.execution_utils`. Coupled with the `get_function_source` API,
380+
this utility creates a locally available Python callable that can be directly accessed, precisely as if it were originally defined
381+
within your current REPL.
382+
383+
```python
384+
from unitycatalog.ai.core.utils.execution_utils import load_function_from_string
385+
386+
func_str = """
387+
def multiply_numbers(a: int, b: int) -> int:
388+
    \"\"\"
389+
    Multiplies two numbers.
390+
391+
    Args:
392+
        a: first number.
393+
        b: second number.
394+
395+
    Returns:
396+
        int
397+
    \"\"\"
398+
    return a * b
399+
"""
400+
401+
# If specifying `register_global=False`, the original function name is not callable; use the
402+
# returned callable reference instead.
403+
my_new_multiplier = load_function_from_string(func_str, register_global=False)
404+
my_new_multiplier(a=1, b=2) # returns `2`
405+
406+
# Alternatively, with global registration enabled, `register_global=True` (the default),
407+
# the original callable name can be used. This will not work in interactive environments like Jupyter.
408+
load_function_from_string(func_str)
409+
multiply_numbers(a=2, b=2) # returns `4`
410+
411+
# For interactive environments, setting the return object directly within globals() is required in order
412+
# to utilize the original function name
413+
alias = load_function_from_string(func_str)
414+
globals()["multiply_numbers"] = alias
415+
multiply_numbers(a=3, b=3) # returns `9`
416+
417+
# Additionally, a scoped namespace can be provided to restrict scope and access to scoped arguments
418+
from types import SimpleNamespace
419+
420+
func_str2 = """
421+
def multiply_numbers_with_constant(a: int, b: int) -> int:
422+
\"\"\"
423+
Multiplies two numbers with a constant.
424+
425+
Args:
426+
a: first number.
427+
b: second number.
428+
429+
Returns:
430+
int
431+
\"\"\"
432+
return a * b * c
433+
"""
434+
435+
c = 100 # Not part of the scoped namespace; local constant
436+
437+
scoped_namespace = {
438+
"__builtins__": __builtins__,
439+
"c": 42,
440+
}
441+
442+
load_function_from_string(func_str2, register_function=True, namespace=scoped_namespace)
443+
444+
scoped_ns = SimpleNamespace(**scoped_namespace)
445+
446+
scoped_ns.multiply_numbers_with_constant(a=2, b=3) # returns 252, utilizing the `c` constant of the namespace
447+
448+
```
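
To illustrate why the scoped namespace governs name resolution, here is a minimal sketch built only on the standard library. `load_callable_from_source` is a hypothetical stand-in and makes no claim about how `load_function_from_string` is implemented; it assumes the source string contains exactly one top-level function.

```python
import ast
from typing import Any, Callable, Dict, Optional


def load_callable_from_source(
    source: str, namespace: Optional[Dict[str, Any]] = None
) -> Callable[..., Any]:
    """Hypothetical loader: exec a single function definition into a namespace."""
    tree = ast.parse(source)
    func_defs = [node for node in tree.body if isinstance(node, ast.FunctionDef)]
    if len(func_defs) != 1:
        raise ValueError("Expected exactly one top-level function definition.")
    target = {} if namespace is None else namespace
    # The dict passed to exec becomes the function's global scope.
    exec(compile(tree, filename="<uc_function>", mode="exec"), target)
    return target[func_defs[0].name]


source = '''
def scale(a: int, b: int) -> int:
    return a * b * c
'''

c = 100  # module-level value; the loaded function never sees it
scoped = {"c": 42}
scale = load_callable_from_source(source, namespace=scoped)
result = scale(a=2, b=3)
print(result)  # 252, because `c` resolves to 42 from `scoped`
```

Because the function's globals are the supplied dictionary, free names such as `c` are looked up there rather than in the caller's module.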
449+
318450
##### Function execution arguments configuration
319451

320452
To manage the function execution behavior using Databricks client under different configurations, we offer the following environment variables:

ai/core/src/unitycatalog/ai/core/base.py

+14
@@ -222,6 +222,20 @@ def to_dict(self):
222222
        Sensitive information should be excluded.
223223
        """
224224

225+
    @abstractmethod
226+
    def get_function_source(self, function_name: str) -> str:
227+
        """
228+
        Get the Python callable definition reconstructed from Unity Catalog
229+
        for a function by its name. The return of this method is a string
230+
        that contains the callable's definition.
231+
232+
        Args:
233+
            function_name: The name of the function to retrieve from Unity Catalog.
234+
235+
        Returns:
236+
            str: The Python callable definition as a string.
237+
        """
238+
225239

226240
# TODO: update BaseFunctionClient to Union[BaseFunctionClient, AsyncBaseFunctionClient] after async client is supported
227241
def get_uc_function_client() -> Optional[BaseFunctionClient]:

ai/core/src/unitycatalog/ai/core/client.py

+51-22
@@ -12,6 +12,9 @@
1212

1313
from unitycatalog.ai.core.base import BaseFunctionClient, FunctionExecutionResult
1414
from unitycatalog.ai.core.paged_list import PagedList
15+
from unitycatalog.ai.core.utils.callable_utils import (
16+
    dynamically_construct_python_function,
17+
)
1518
from unitycatalog.ai.core.utils.callable_utils_oss import (
1619
    generate_function_info,
1720
    generate_wrapped_function_info,
@@ -845,7 +848,7 @@ def _execute_uc_function(
845848
except Exception as e:
846849
return FunctionExecutionResult(error=str(e))
847850
else:
848-
python_function = dynamically_construct_python_function(function_info)
851+
python_function = _get_callable_definition(function_info)
849852
exec(python_function, self.func_cache)
850853
try:
851854
func = self.func_cache[function_info.name]
@@ -907,30 +910,28 @@ def to_dict(self) -> Dict[str, Any]:
907910
        elements = ["uc"]
908911
        return {k: getattr(self, k) for k in elements if getattr(self, k) is not None}
909912

913+
    @override
914+
    def get_function_source(self, function_name: str) -> str:
915+
        """
916+
        Returns the Python callable definition as a string for an EXTERNAL Python function that
917+
        is stored within Unity Catalog. This function can only parse and extract the full callable
918+
        definition for Python functions and cannot be used on SQL or TABLE functions.
910919
911-
def dynamically_construct_python_function(function_info: FunctionInfo) -> str:
912-
    """
913-
    Construct a Python function from the given FunctionInfo.
914-
915-
    Args:
916-
        function_info: The FunctionInfo object containing the function metadata.
917-
918-
    Returns:
919-
        The re-constructed function definition.
920-
    """
920+
        NOTE: To unify the behavior of creating a valid Python callable, existing indentation in the
921+
        stored function body will be unified to a consistent indentation level of `4` spaces.
921922
922-
    param_names = []
923-
    if function_info.input_params and function_info.input_params.parameters:
924-
        param_names = [param.name for param in function_info.input_params.parameters]
925-
    function_head = f"{function_info.name}({', '.join(param_names)})"
926-
    func_def = f"def {function_head}:\n"
927-
    if function_info.routine_body == "EXTERNAL":
928-
        for line in function_info.routine_definition.split("\n"):
929-
            func_def += f"    {line}\n"
930-
    else:
931-
        raise NotImplementedError(f"routine_body {function_info.routine_body} not supported")
923+
        Args:
924+
            function_name: The name of the function to retrieve the Python callable definition for.
932925
933-
    return func_def
926+
        Returns:
927+
            str: The Python callable definition as a string.
928+
        """
929+
        function_info = self.get_function(function_name)
930+
        if function_info.routine_body != "EXTERNAL":
931+
            raise ValueError(
932+
                f"Function {function_name} is not an EXTERNAL Python function and cannot be retrieved."
933+
            )
934+
        return dynamically_construct_python_function(function_info=function_info)
934935

935936

936937
def validate_input_parameter(
@@ -1002,3 +1003,31 @@ def validate_param(param: Any, column_type: str, param_type_text: str) -> None:
10021003
f"Invalid interval type text: {param_type_text}, expecting 'interval day to second', "
10031004
"python timedelta can only be used for day-time interval."
10041005
)
1006+
1007+
1008+
def _get_callable_definition(function_info: FunctionInfo) -> str:
1009+
    """
1010+
    Construct a Python function from the given FunctionInfo without docstring, comments, or types.
1011+
    This function is used purely for local function execution encapsulated within a call to
1012+
    `execute_function` and is not intended to be used for retrieving a callable definition.
1013+
    Use `get_function_source` instead.
1014+
1015+
    Args:
1016+
        function_info: The FunctionInfo object containing the function metadata.
1017+
1018+
    Returns:
1019+
        The minimal re-constructed function definition.
1020+
    """
1021+
1022+
    param_names = []
1023+
    if function_info.input_params and function_info.input_params.parameters:
1024+
        param_names = [param.name for param in function_info.input_params.parameters]
1025+
    function_head = f"{function_info.name}({', '.join(param_names)})"
1026+
    func_def = f"def {function_head}:\n"
1027+
    if function_info.routine_body == "EXTERNAL":
1028+
        for line in function_info.routine_definition.split("\n"):
1029+
            func_def += f"    {line}\n"
1030+
    else:
1031+
        raise NotImplementedError(f"routine_body {function_info.routine_body} not supported")
1032+
1033+
    return func_def
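
The minimal reconstruction above, followed by the `exec` into a function cache seen in `_execute_uc_function`, can be sketched in isolation as follows. This is a simplified stand-in under stated assumptions: `SimpleNamespace` objects replace the real `FunctionInfo`, the flattened `parameters` attribute is hypothetical, and `build_minimal_definition` is an illustrative name rather than a function from this commit.

```python
from types import SimpleNamespace


def build_minimal_definition(info) -> str:
    """Illustrative only: rebuild `def name(params):` plus an indented body."""
    param_names = [param.name for param in info.parameters]
    lines = [f"def {info.name}({', '.join(param_names)}):"]
    # Indent every stored body line by four spaces under the new header.
    lines += [f"    {line}" for line in info.routine_definition.split("\n")]
    return "\n".join(lines) + "\n"


# Hypothetical metadata standing in for a FunctionInfo object.
info = SimpleNamespace(
    name="add",
    parameters=[SimpleNamespace(name="a"), SimpleNamespace(name="b")],
    routine_definition="total = a + b\nreturn total",
)

definition = build_minimal_definition(info)
func_cache = {}
exec(definition, func_cache)  # registers `add` inside func_cache
print(func_cache["add"](1, 2))  # 3
```

The rebuilt header carries no type hints or docstring, which is consistent with the docstring's description of this path as execution-only.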

ai/core/src/unitycatalog/ai/core/databricks.py

+22
@@ -20,6 +20,7 @@
2020
from unitycatalog.ai.core.paged_list import PagedList
2121
from unitycatalog.ai.core.types import Variant
2222
from unitycatalog.ai.core.utils.callable_utils import (
23+
    dynamically_construct_python_function,
2324
generate_sql_function_body,
2425
generate_wrapped_sql_function_body,
2526
)
@@ -742,6 +743,27 @@ def from_dict(cls, config: Dict[str, Any]):
742743
        accept_keys = ["profile"]
743744
        return cls(**{k: v for k, v in config.items() if k in accept_keys})
744745

746+
    @override
747+
    def get_function_source(self, function_name: str) -> str:
748+
        """
749+
        Returns the Python callable definition as a string for an EXTERNAL Python function that
750+
        is stored within Unity Catalog. This function can only parse and extract the full callable
751+
        definition for Python functions and cannot be used on SQL or TABLE functions.
752+
753+
        Args:
754+
            function_name: The name of the function to retrieve the Python callable definition for.
755+
756+
        Returns:
757+
            str: The Python callable definition as a string.
758+
        """
759+
760+
        function_info = self.get_function(function_name)
761+
        if function_info.routine_body.value != "EXTERNAL":
762+
            raise ValueError(
763+
                f"Function {function_name} is not an EXTERNAL Python function and cannot be retrieved."
764+
            )
765+
        return dynamically_construct_python_function(function_info)
766+
745767

746768
def is_scalar(function: "FunctionInfo") -> bool:
747769
    from databricks.sdk.service.catalog import ColumnTypeName
