Register pandas dataframe memory issues #1585

Open
@byronverz

Description

When using the register_pandas_dataframe() method as suggested by the tip here, I get a system memory error:

Error Code: ScriptExecution.ReadDataFrame.Unexpected 
Failed Step: 1d7c20d4-6b70-4075-8b68-74f51264bf5b 
Error Message: ScriptExecutionException was caused by ReadDataFrameException. Unexpected exception during ReadDataframeFromSocket. 
Failed to read DataFrame from host. Exception of type 'System.OutOfMemoryException' was thrown.

When I run the same method with the same dataframe from a Jupyter notebook, it works as expected. I made sure there was enough memory available when running the script (the dataframe is 1.42 MiB and 6 GB of RAM was free), so I don't think that's the issue. I know this method is experimental, so for now I am using the older method, which works fine.
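
A minimal sketch of a standalone script that exercises this call might look like the following. It is not the exact code from the failing script: the helper names, the use of Workspace.from_config(), the default datastore as the upload target, and the dataset name are all assumptions made for illustration. The size helper is included only to show one way the "1.42 MiB" figure could be checked before registering.

```python
"""Hypothetical repro sketch: register a pandas DataFrame from a plain script."""


def dataframe_size_mib(df) -> float:
    """Return the in-memory size of a pandas DataFrame in MiB.

    deep=True also counts the contents of object (e.g. string) columns.
    """
    return df.memory_usage(deep=True).sum() / (1024 ** 2)


def register_from_script(df, dataset_name: str):
    """Register `df` as a tabular dataset via the experimental method."""
    # Imported lazily so the size helper works without the Azure ML SDK installed.
    from azureml.core import Dataset, Workspace

    # Assumption: a workspace config.json is discoverable from the working directory.
    ws = Workspace.from_config()
    # Assumption: the workspace default datastore is used as the upload target.
    datastore = ws.get_default_datastore()

    print(f"DataFrame size: {dataframe_size_mib(df):.2f} MiB")
    return Dataset.Tabular.register_pandas_dataframe(
        df, target=datastore, name=dataset_name
    )
```

Calling register_from_script(df, "my-dataset") from a .py file (the dataset name is a placeholder) would follow the script code path described above, while pasting the same function into a notebook cell would follow the path that works.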

Another issue that sometimes occurs (when the system memory error does not) is a StreamAccess validation error:

azureml.dataprep.api.errorhandlers.ExecutionError: 
Error Code: ScriptExecution.ReadDataFrame.StreamAccess.Validation
Validation Error Code: Invalid
Validation Target: PreppyFile
Failed Step: bfd9a1d5-c01f-485a-8761-99cd0a41d0c3
Error Message: ScriptExecutionException was caused by ReadDataFrameException.
  Failed to read Pandas DataFrame form Python host. Make sure Dataflow is created directly from the source Pandas DataFrame.
    StreamAccessException was caused by ValidationException.
      Trying to read an invalid file. Missing sentinel value in the beginning
| session_id=710cc9c3-4478-4be9-998c-0e4a009800f5

Again, this does not happen when using the method from a Jupyter notebook, only when running a script on my local machine.


Document Details

Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

Labels: ADO (issue is documented on MSFT ADO for internal tracking), Data4ML, product-issue
