Incompatibility with Modern numpy Versions and Installation Issues on Python 3.11 with Apple Silicon #358

@hosseini-rtr

Description

When trying to use hazm with Python 3.11 on a macOS system with Apple Silicon (ARM64), I encountered issues due to its dependency on an outdated version of numpy (1.5.0). This version is incompatible with Python 3.11 and fails to install due to changes in distutils.sysconfig (missing _init_posix attribute). Additionally, newer versions of numpy (e.g., 1.26.4 or 2.2.5) seem to cause import errors, such as ModuleNotFoundError: No module named 'numpy.rec'.

Environment

  • Operating System: macOS
  • Architecture: Apple Silicon (ARM64)
  • Python Version: 3.11
  • Hazm Version: 0.10.0
  • numpy Version Tried: 1.5.0 (required by hazm), 1.26.4, 2.2.5
  • Other Relevant Packages: scikit-learn, pandas, gensim==4.3.3
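For anyone trying to confirm they are on a comparable setup, the environment details above can be collected with a short script. This is only a sketch using the standard library; the function name environment_report is made up for illustration. Note that a Python build running under Rosetta reports x86_64 rather than arm64, which is worth ruling out on Apple Silicon.

```python
import platform
import sys


def environment_report():
    """Collect the interpreter and platform facts relevant to this report."""
    return {
        "os": platform.system(),              # "Darwin" on macOS
        "architecture": platform.machine(),   # "arm64" for a native Apple Silicon build
        "python": platform.python_version(),
        "executable": sys.executable,         # shows which virtual environment is active
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```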

Error Details

Attempt 1: Installing hazm with newer numpy (e.g., 2.2.5)

When importing hazm.Normalizer, I get the following error:
ModuleNotFoundError: No module named 'numpy.rec'

This seems to be because hazm or its dependencies (e.g., sklearn) expect an older numpy version.

Attempt 2: Installing numpy==1.5.0

When trying to install numpy==1.5.0 (as required by hazm), I get an installation error:
AttributeError: module 'distutils.sysconfig' has no attribute '_init_posix'

This is likely because numpy==1.5.0 is too old for Python 3.11 and does not support ARM64 architecture.

Steps to Reproduce

  1. Create a virtual environment with Python 3.11 on macOS (Apple Silicon).
  2. Run pip install hazm.
  3. Run from hazm import Normalizer (fails with the numpy.rec error when a newer numpy such as 1.26.4 or 2.2.5 is installed).
  4. Alternatively, try pip install numpy==1.5.0 (fails with distutils error).

What I Tried

  • Installed newer numpy versions (1.26.4, 2.2.5), but got ModuleNotFoundError: No module named 'numpy.rec'.
  • Tried installing numpy==1.5.0, but it fails due to distutils incompatibility with Python 3.11.
  • Considered using alternative libraries like parsivar or virastar, which work fine with modern numpy versions.

Sample Code:

import pandas as pd
import torch
import transformers
import chromadb
from hazm import Normalizer
import numpy as np
import sklearn

print("pandas:", pd.__version__)
print("torch:", torch.__version__)
print("MPS available:", torch.backends.mps.is_available())  # should be True
print("transformers:", transformers.__version__)
print("chromadb:", chromadb.__version__)
print("numpy:", np.__version__)
print("scikit-learn:", sklearn.__version__)
print("hazm:", Normalizer().normalize("تست نرمال‌سازی"))
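Because the script above stops at the first failing import, it does not show whether the other packages load correctly. A variant that imports each module separately may help narrow down where the numpy.rec error comes from. This is a sketch, not part of hazm; the helper name import_report is hypothetical, and it assumes each module exposes __version__ (it reports "unknown" otherwise).

```python
import importlib


def import_report(module_names):
    """Import each module separately; map name -> version string or error text."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # keep going so one failure does not hide the rest
            report[name] = f"FAILED ({type(exc).__name__}: {exc})"
        else:
            report[name] = str(getattr(module, "__version__", "unknown"))
    return report


if __name__ == "__main__":
    names = ["numpy", "sklearn", "pandas", "torch", "transformers", "chromadb", "hazm"]
    for name, result in import_report(names).items():
        print(f"{name}: {result}")
```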

Expected Behavior

hazm should be compatible with recent numpy versions (e.g., 1.26.4 or higher) and installable on Python 3.11 with Apple Silicon without requiring an outdated numpy==1.5.0.

Suggested Fix

  • Update hazm dependencies to support newer numpy versions (e.g., >=1.24).
  • Remove or update reliance on numpy.rec in hazm or its dependencies (e.g., sklearn).
  • Test and document compatibility with Python 3.11 and ARM64 architectures.

Additional Notes

I’m happy to provide more details or test any proposed fixes. This issue is critical for projects relying on hazm for Persian NLP on modern systems.

Thanks for maintaining this great library!
