New package: EDA v1.0.0 #122417

Closed
JuliaRegistrator wants to merge 1 commit into master from registrator-eda-b2da3285-v1.0.0-66b5338f1f

@JuliaRegistrator JuliaRegistrator commented Jan 5, 2025

  • Registering package: EDA
  • Repository: https://github.com/notGiGi/ExploratoryDataAnalysis
  • Created by: @notGiGi
  • Version: v1.0.0
  • Commit: df15d8d61cabab1d94c069aeed49f44470a64f6b
  • Reviewed by: @notGiGi
  • Reference: https://github.com/notGiGi/ExploratoryDataAnalysis/issues/8
  • Description: The EDA Pkg simplifies exploratory data analysis by offering tools for data visualization, cleaning, and preparation. It includes functions for handling missing values, detecting outliers, analyzing correlations, and generating insightful visualizations. Perfect for understanding and preparing datasets for advanced analysis or modeling.
  • Release notes:
EDA.jl

## Version 0.1.0

### Highlights
- Initial release of **EDA.jl**, a powerful and comprehensive Julia package for exploratory data analysis.

### Features
#### Data Cleaning and Handling
- `threshold`: Remove columns with missing data exceeding a user-defined percentage.
- `outlierswithiqr`: Detect outliers using the Interquartile Range (IQR) method.
- `outlierhandle`: Handle outliers with configurable options for removal.
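
For readers unfamiliar with the IQR rule named above, here is a minimal sketch of the usual textbook formulation: a value is flagged when it lies more than `k` times the interquartile range below the first quartile or above the third. This is not EDA.jl's code; the function name `iqr_outlier_indices` and the default `k = 1.5` are assumptions made for illustration.

```julia
using Statistics  # standard library, provides quantile

# Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR].
# `iqr_outlier_indices` and the default k = 1.5 are illustrative assumptions,
# not part of EDA.jl's documented API.
function iqr_outlier_indices(values::AbstractVector{<:Real}; k::Real = 1.5)
    q1, q3 = quantile(values, [0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return findall(v -> v < lower || v > upper, values)
end

data = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0]
outliers = iqr_outlier_indices(data)
println(outliers)  # only the last element (100.0) falls outside the fences
```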

#### Data Exploration
- `visualize_data`: Preview the first `n` rows of a dataset.
- `dataType`: Analyze and display data types for all columns.
- `correlation`: Compute correlation matrices for numeric columns.
- `heat`: Generate heatmaps of correlation matrices for easy interpretation.
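
As a rough idea of what a correlation matrix over numeric columns contains, the sketch below uses `cor` from Julia's `Statistics` standard library on a plain matrix. It does not call EDA.jl, and the sample values are made up for illustration.

```julia
using Statistics  # standard library, provides cor

# Rows are observations, columns are variables (made-up sample data).
X = [1.0 2.0 9.0;
     2.0 4.0 7.0;
     3.0 6.0 4.0;
     4.0 8.0 1.0]

# Pearson correlation between every pair of columns: a symmetric 3x3 matrix
# with ones on the diagonal.
C = cor(X)

println(round.(C; digits = 3))
```

Here column 2 is an exact multiple of column 1, so their correlation is 1, while column 3 decreases as column 1 grows and therefore correlates negatively with it.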

#### Visualization
- `correlation_network`: Create an interactive graph of correlations between variables.
- Real vs. predicted plots for regression analysis.

#### Statistical Analysis
- `linearregression`: Fit linear regression models with detailed output of coefficients and predictions.
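
The coefficients and predictions mentioned above can be illustrated with ordinary least squares on a single predictor. This is a generic sketch, not the package's `linearregression` implementation; `fit_ols` is a hypothetical helper name and the data are invented.

```julia
using LinearAlgebra  # standard library, least-squares solve via the backslash operator

# Fit y ≈ intercept + slope * x by ordinary least squares.
# `fit_ols` is a hypothetical name used only for this sketch.
function fit_ols(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    X = hcat(ones(length(x)), x)   # design matrix: intercept column plus predictor
    coeffs = X \ y                 # least-squares solution for a tall matrix
    predictions = X * coeffs
    return coeffs, predictions
end

x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]
coeffs, predictions = fit_ols(x, y)
println("intercept = ", coeffs[1], ", slope = ", coeffs[2])
```

Plotting `predictions` against `y` gives the kind of real vs. predicted comparison listed under Visualization.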

#### Efficiency
- All functionalities are encapsulated in the `EDALoader` structure, allowing for efficient data manipulation with built-in caching.

### Improvements and Fixes
- Optimized correlation and visualization functions to handle missing data gracefully.
- Added error handling and user-friendly messages for all major functions.
- Improved plotting aesthetics for enhanced data visualization.

### Planned Features for Future Releases
- Advanced time-series analysis tools:
  - Seasonal decomposition.
  - Trend detection.
  - Forecasting models (ARIMA, Prophet, etc.).
- Integration with PlotlyJS for interactive visualizations.
- Machine learning integration with Julia ML frameworks.

### Known Issues
- Large datasets may experience performance degradation in some visualization functions. Optimization is planned in upcoming versions.

### How to Upgrade
1. Open Julia's REPL.
2. Run the following commands:
   ```julia
   using Pkg
   Pkg.update("EDA")
   ```

github-actions bot commented Jan 5, 2025

Hello, I am an automated registration bot. I help manage the registration process by checking your registration against a set of AutoMerge guidelines. If all these guidelines are met, this pull request will be merged automatically, completing your registration. It is strongly recommended to follow the guidelines, since otherwise the pull request needs to be manually reviewed and merged by a human.

1. New package registration

Please make sure that you have read the package naming guidelines.

2. AutoMerge Guidelines which are not met ❌

  • Name does not meet all of the following: starts with an upper-case letter, ASCII alphanumerics only, not all letters are upper-case.

  • Name is not at least 5 characters long

  • Repo URL does not end with /name.jl.git, where name is the package name

  • Package name similar to 18 existing packages.

    Similar package names
    1. Similar to EDF. Damerau-Levenshtein distance 1 is at or below cutoff of 2. Damerau-Levenshtein distance 1 between lowercased names is at or below cutoff of 1. Normalized visual distance 2.08 is at or below cutoff of 2.50.
    2. Similar to FCA. Damerau-Levenshtein distance 2 is at or below cutoff of 2. Normalized visual distance 1.68 is at or below cutoff of 2.50.
    3. Similar to FDM. Damerau-Levenshtein distance 2 is at or below cutoff of 2. Normalized visual distance 1.94 is at or below cutoff of 2.50.
    4. Similar to SHA. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    5. Similar to BDF. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    6. Similar to ECC. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    7. Similar to GDAL. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    8. Similar to ODE. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    9. Similar to BED. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    10. Similar to XPA. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    11. Similar to XDF. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    12. Similar to MD5. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    13. Similar to JDF. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    14. Similar to ADI. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    15. Similar to SDPA. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    16. Similar to ERFA. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    17. Similar to CUDA. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
    18. Similar to VIDA. Damerau-Levenshtein distance 2 is at or below cutoff of 2.
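
To make the failed checks above easier to follow, the sketch below restates the naming rules as small predicates and computes an edit distance between package names. It is not AutoMerge's actual implementation: all function names are hypothetical, the distance is the restricted (optimal string alignment) variant of Damerau-Levenshtein, and the normalized visual distance is not covered.

```julia
# Naming rules as quoted in the bot's list (helper names are hypothetical).
starts_uppercase(name) = isuppercase(first(name))
ascii_alphanumeric(name) = all(c -> isascii(c) && (isletter(c) || isdigit(c)), name)
not_all_caps(name) = !all(isuppercase, filter(isletter, name))
long_enough(name) = length(name) >= 5
repo_matches(url, name) = endswith(url, "/$name.jl.git")

# Restricted Damerau-Levenshtein (optimal string alignment) distance:
# insertion, deletion, substitution, and adjacent transposition each cost 1.
function osa_distance(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    m, n = length(s), length(t)
    d = zeros(Int, m + 1, n + 1)
    d[:, 1] .= 0:m   # distance from each prefix of `a` to the empty string
    d[1, :] .= 0:n   # distance from the empty string to each prefix of `b`
    for i in 1:m, j in 1:n
        cost = s[i] == t[j] ? 0 : 1
        d[i + 1, j + 1] = min(d[i, j + 1] + 1,   # deletion
                              d[i + 1, j] + 1,   # insertion
                              d[i, j] + cost)    # substitution
        if i > 1 && j > 1 && s[i] == t[j - 1] && s[i - 1] == t[j]
            # adjacent transposition
            d[i + 1, j + 1] = min(d[i + 1, j + 1], d[i - 1, j - 1] + 1)
        end
    end
    return d[m + 1, n + 1]
end

name = "EDA"
repo = "https://github.com/notGiGi/ExploratoryDataAnalysis.git"
println((not_all_caps(name), long_enough(name), repo_matches(repo, name)))
println(osa_distance("EDA", "EDF"), " ", osa_distance("EDA", "CUDA"))
```

Under these assumptions, `EDA` fails the same three checks the bot reports (all upper-case, shorter than 5 characters, repository URL not ending in `/EDA.jl.git`), and its distances to `EDF` and `CUDA` come out as 1 and 2, matching the list.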

3. Needs action: here's what to do next

  1. Please try to update your package to conform to these guidelines. The General registry's README has an FAQ that can help figure out how to do so.
  2. After you have fixed the AutoMerge issues, simply retrigger Registrator, the same way you did in the initial registration. This will automatically update this pull request. You do not need to change the version number in your Project.toml file (unless the AutoMerge issue is that you skipped a version number).

If you need help fixing the AutoMerge issues, or want your pull request to be manually merged instead, please post a comment explaining what you need help with or why you would like this pull request to be manually merged. Then, send a message to the #pkg-registration channel in the public Julia Slack for better visibility.

4. To pause or stop registration

If you want to prevent this pull request from being auto-merged, simply leave a comment. If you want to post a comment without blocking auto-merging, you must include the text [noblock] in your comment.

Tip: You can edit blocking comments to add [noblock] in order to unblock auto-merging.

@JuliaRegistrator JuliaRegistrator force-pushed the registrator-eda-b2da3285-v1.0.0-66b5338f1f branch from 3724966 to f6e5e74 on January 5, 2025 03:04
UUID: b2da3285-ee56-486b-95f3-9dc7b9d1c35a
Repo: https://github.com/notGiGi/ExploratoryDataAnalysis.git
Tree: eebff6650e5f038f8b7b6d6712bfac2ce1298a47

Registrator tree SHA: 17aec322677d9b81cdd6b9b9236b09a3f1374c6a
@JuliaRegistrator JuliaRegistrator force-pushed the registrator-eda-b2da3285-v1.0.0-66b5338f1f branch from f6e5e74 to 78c146f on January 5, 2025 03:30

goerz commented Jan 5, 2025

Closing in favor of #122422

@goerz goerz closed this Jan 5, 2025
@giordano giordano deleted the registrator-eda-b2da3285-v1.0.0-66b5338f1f branch January 12, 2025 21:16