Commit b7944ab

Merge pull request #132 from UBC-MDS/doc_fix/add_tools_contributing.md
Update Contributing.md + Fix optimize_categorical.py
2 parents fc8ce73 + 17f143e commit b7944ab

2 files changed

Lines changed: 33 additions & 1 deletion

CONTRIBUTING.md

Lines changed: 32 additions & 0 deletions
@@ -104,3 +104,35 @@ Before you submit a pull request, check that it meets these guidelines:
    new functionality into a function with a docstring.
 3. Your pull request will automatically be checked by the full test suite.
    It needs to pass all of them before it can be considered for merging.
+
+## Development Tools and Practices
+
+This project uses modern software tools and organizational practices to ensure quality, reproducibility, and effective collaboration among team members.
+
+### Tools and Infrastructure
+
+- **GitHub** was used as the main tool for version control and communication. To reduce errors, changes were developed on separate branches and merged through Pull Requests (PRs).
+
+- **GitHub Issues and Project Boards** were used to divide tasks, distribute the workload evenly, and track project milestones.
+
+- **Continuous Integration (CI)** ran the tests automatically on each branch, verifying that new changes worked correctly before they were merged into `main`.
+
+- **pytest** provided automated tests that validate the behavior of the package's functions.
+
+- **Environment Management** was handled through `environment.yml` to ensure reproducibility across development environments.
+
+- **Documentation** was maintained in Quarto files to give future users a clear usage guide.
+
+- **Netlify** was used for Milestone 4 to automatically deploy the project documentation website: every change pushed to the repository triggered a new site build, keeping the published documentation in sync with the repository.
+
+- **Gitflow Workflow** principles were applied to structure development, improving code stability and supporting parallel development.
+
+### Organizational Practices
+
+- The collaborators consistently followed a **branching** strategy that kept the workflow clear and well managed. Before a PR is merged into `main`, at least one collaborator must review it and provide constructive feedback or suggestions where needed.
+
+- The code of conduct sets clear guidelines that support and shape collaboration.
+
+### Scaling the Project
+
+If this project were scaled to a larger or production-level application, additional tools and practices would be required. These include stronger code reviews, more tests, versioned releases, and better dependency management. Automated deployment and CI/CD pipelines would help maintain reliability as the project grows.

src/group_32/optimize_categorical.py

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ def optimize_categorical(df: pd.DataFrame, max_unique_ratio: float = 0.5) -> pd.
         n_col = df_copy[col]
 
         if n_col.isnull().all(): #if the column is empty, terminate the current loop
-            break
+            continue
 
         n_unique = n_col.nunique(dropna=False)
         ratio = n_unique / n_rows
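
The effect of replacing `break` with `continue` can be illustrated with a minimal sketch. The full body of `optimize_categorical` is not shown in this diff, so the function below is a simplified, hypothetical stand-in: the name `convert_low_cardinality`, the `stop_on_empty` flag, and the `ratio <= max_unique_ratio` conversion rule are all assumptions made for illustration. Only the variable names `df_copy`, `n_col`, `n_unique`, `n_rows`, and `ratio` come from the diff.

```python
import pandas as pd


def convert_low_cardinality(
    df: pd.DataFrame,
    max_unique_ratio: float = 0.5,
    stop_on_empty: bool = False,
) -> pd.DataFrame:
    """Hypothetical, simplified stand-in for optimize_categorical.

    stop_on_empty=True mimics the old `break`; the default mimics `continue`.
    """
    df_copy = df.copy()
    n_rows = len(df_copy)

    for col in df_copy.columns:
        n_col = df_copy[col]

        if n_col.isnull().all():
            if stop_on_empty:
                break  # old behavior: every remaining column is left untouched
            continue  # fixed behavior: only this empty column is skipped

        n_unique = n_col.nunique(dropna=False)
        ratio = n_unique / n_rows

        # Assumed conversion rule; the real threshold logic is not in the diff.
        if ratio <= max_unique_ratio:
            df_copy[col] = n_col.astype("category")

    return df_copy


# An all-null column placed before a low-cardinality column.
df = pd.DataFrame(
    {
        "empty": [None, None, None, None],
        "city": ["Vancouver", "Vancouver", "Toronto", "Toronto"],
    }
)

fixed = convert_low_cardinality(df)                     # `continue` path
old = convert_low_cardinality(df, stop_on_empty=True)   # `break` path

print(isinstance(fixed["city"].dtype, pd.CategoricalDtype))
print(isinstance(old["city"].dtype, pd.CategoricalDtype))
```

Under these assumptions, the `continue` path still converts `city` even though an empty column precedes it, while the `break` path exits the loop at `empty` and never reaches `city`. A regression test along these lines would guard against the bug returning.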
