
3기_4주차_김수임 #4

Open
softee220 wants to merge 5 commits into HateSlop:main from softee220:softee220

Conversation

@softee220 softee220 commented Oct 4, 2025

IMDB data + sentiment analysis model

Summary by CodeRabbit

  • Style
    • Notebook cells now open expanded by default, improving readability and reducing extra clicks.
    • Previously collapsed cells are immediately visible, simplifying review and step-by-step execution.
    • No changes to computation, outputs, or error behavior; this update only affects presentation.
    • Applied consistently across multiple cells in the notebook for a more predictable viewing experience.


coderabbitai bot commented Oct 4, 2025

Walkthrough

Removed cell metadata flags in huggingface_basics.ipynb (UI-only change). Added a new Jupyter notebook softee220/huggingface_assignment.ipynb implementing a sentiment/emotion analysis workflow: environment setup, IMDB data sampling, a transformer text-classification pipeline, an emotion-normalizing helper, dataset mapping, and result curation.

Changes

Notebook metadata normalization: huggingface_basics.ipynb
Removed "collapsed": true entries from multiple notebook cell metadata fields; no code, logic, or execution changes.

New sentiment/emotion analysis notebook: softee220/huggingface_assignment.ipynb
Added a new workflow notebook that installs dependencies, loads the Stanford IMDB dataset, samples 200 examples, loads a text-classification pipeline (SamLowe/roberta-base-go_emotions), defines analyze_emotion to normalize pipeline outputs, maps annotations across the subset, and prepares DataFrame results (optional CSV export is commented out). No public API or exported signature changes.
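The analyze_emotion helper is only described above, not shown. A minimal sketch of such an output normalizer (the function body below is an assumption for illustration, not the notebook's actual code) could look like:

```python
def analyze_emotion(raw):
    """Normalize a text-classification pipeline output into {label, score}.

    Hugging Face pipelines can return a single dict, a list of dicts, or a
    nested list (when top_k / return_all_scores is used), so this sketch
    unwraps any nesting and keeps the highest-scoring entry.
    """
    # Unwrap nested lists: [[{...}, ...]] -> [{...}, ...]
    while isinstance(raw, list) and len(raw) == 1 and isinstance(raw[0], list):
        raw = raw[0]
    candidates = [raw] if isinstance(raw, dict) else list(raw)
    best = max(candidates, key=lambda d: d["score"])
    return {"label": best["label"], "score": float(best["score"])}

# Example with a top_k-style nested output:
print(analyze_emotion([[{"label": "joy", "score": 0.91},
                        {"label": "anger", "score": 0.04}]]))
# → {'label': 'joy', 'score': 0.91}
```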

Sequence Diagram(s)

sequenceDiagram
  actor User
  participant Notebook as "Jupyter Notebook\n(softee220/huggingface_assignment.ipynb)"
  participant Datasets as "datasets (IMDB)"
  participant Pipeline as "Transformer Pipeline\n(SamLowe/roberta-base-go_emotions)"
  participant Results as "DataFrame / CSV"

  User ->> Notebook: open & run cells
  Notebook ->> Datasets: load IMDB dataset (train split)
  Datasets -->> Notebook: dataset object / subset (200)
  Notebook ->> Pipeline: load text-classification pipeline
  Notebook ->> Pipeline: analyze_emotion(text) [map over subset]
  Pipeline -->> Notebook: labels + scores (varied shapes)
  Notebook ->> Notebook: normalize outputs into {label, score}
  Notebook ->> Results: assemble DataFrame (text, emotion)
  Notebook ->> User: display results (optional save)
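The flow above can be sketched end to end in plain Python. Here the transformer pipeline is replaced by an offline stub (the real notebook loads SamLowe/roberta-base-go_emotions), and the column names are assumptions:

```python
import pandas as pd

def stub_pipeline(text):
    # Stand-in for the real call:
    # pipeline("text-classification", model="SamLowe/roberta-base-go_emotions")
    label = "joy" if "great" in text else "disappointment"
    return [{"label": label, "score": 0.9}]

texts = ["a great movie", "a dull, plodding plot"]  # toy stand-in for the 200 IMDB samples
records = []
for text in texts:
    out = stub_pipeline(text)[0]  # pipeline returns a list of {label, score} dicts
    records.append({"text": text, "emotion": out["label"], "score": out["score"]})

df = pd.DataFrame(records)  # assemble results (text, emotion, score)
# df.to_csv("emotion_results.csv", index=False)  # optional export, as in the notebook
print(df["emotion"].tolist())  # → ['joy', 'disappointment']
```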

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

In a tidy patch I nibble code and hay,
I opened sleepy cells so they greet the day.
I sampled tales from silver-screened review,
I nudged emotions into labels true.
Hop, hop—short review, then off I play. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Title Check: ❓ Inconclusive
Explanation: The title “3기_4주차_김수임” reads like an assignment identifier rather than a summary of the key modifications, so it is too vague for a reviewer to understand the main updates at a glance.
Resolution: Update the title to a concise, descriptive summary of the changes, for example “Remove collapsed cell metadata in huggingface_basics and add IMDB sentiment analysis notebook”, so reviewers can immediately grasp the PR’s purpose.
✅ Passed checks (2 passed)
Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
softee220/huggingface_assignment.ipynb (1)

67-75: Import display explicitly.

display() isn’t defined in a plain Python runtime, and static analysis already flags it. Import it alongside the other notebook utilities to keep the code portable and lint-clean.

-from datasets import load_dataset, DatasetDict
-from transformers import pipeline
-import torch
-import pandas as pd # 데이터 확인용
+from datasets import load_dataset, DatasetDict
+from transformers import pipeline
+import torch
+import pandas as pd # 데이터 확인용
+from IPython.display import display
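Ruff flags display as undefined (the F821 findings below) because display is an IPython convenience name, not a Python builtin. A quick check, runnable anywhere:

```python
import builtins

# display() is injected into the interactive namespace by IPython/Jupyter at
# runtime; plain Python has no such builtin, hence Ruff's F821 on the notebook.
print(hasattr(builtins, "display"))  # → False in a plain Python runtime
```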
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 962235f and fab7daa.

📒 Files selected for processing (1)
  • softee220/huggingface_assignment.ipynb (1 hunks)
🧰 Additional context used
🪛 Ruff (0.13.3)
softee220/huggingface_assignment.ipynb

43-43: Undefined name display

(F821)


93-93: Undefined name display

(F821)


99-99: Undefined name display

(F821)

Comment on lines +65 to +75
"!pip install transformers datasets sentencepiece accelerate torch\n",
"\n",
"import datasets\n",
"from datasets import load_dataset, DatasetDict\n",
"from transformers import pipeline\n",
"import torch\n",
"import pandas as pd # 데이터 확인용\n",
"\n",
"# GPU 사용 가능 여부 확인 및 설정 (Colab에서는 보통 GPU 사용 가능)\n",
"device = 0 if torch.cuda.is_available() else -1\n",
"print(f\"사용 가능한 디바이스: {'GPU' if device == 0 else 'CPU'}\")"

⚠️ Potential issue | 🟠 Major

Avoid downgrading Colab’s CUDA-enabled torch.

Running pip install torch in Colab removes the preinstalled GPU build and pulls a CPU-only wheel, so torch.cuda.is_available() flips to False and all inference falls back to CPU. Drop the plain torch install (or replace it with a CUDA-specific wheel) to preserve GPU acceleration.

-!pip install transformers datasets sentencepiece accelerate torch
+!pip install transformers datasets sentencepiece accelerate
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"!pip install transformers datasets sentencepiece accelerate torch\n",
"\n",
"import datasets\n",
"from datasets import load_dataset, DatasetDict\n",
"from transformers import pipeline\n",
"import torch\n",
"import pandas as pd # 데이터 확인용\n",
"\n",
"# GPU 사용 가능 여부 확인 및 설정 (Colab에서는 보통 GPU 사용 가능)\n",
"device = 0 if torch.cuda.is_available() else -1\n",
"print(f\"사용 가능한 디바이스: {'GPU' if device == 0 else 'CPU'}\")"
!pip install transformers datasets sentencepiece accelerate
import datasets
from datasets import load_dataset, DatasetDict
from transformers import pipeline
import torch
import pandas as pd # 데이터 확인용
# GPU 사용 가능 여부 확인 및 설정 (Colab에서는 보통 GPU 사용 가능)
device = 0 if torch.cuda.is_available() else -1
print(f"사용 가능한 디바이스: {'GPU' if device == 0 else 'CPU'}")
🤖 Prompt for AI Agents
In softee220/huggingface_assignment.ipynb around lines 65 to 75, the notebook
unconditionally runs a plain "pip install torch" which on Colab replaces the
preinstalled CUDA-enabled torch with a CPU-only wheel and disables GPU; remove
the plain "pip install torch" line (or replace it with a Colab-compatible
CUDA-specific wheel only when needed) and leave the device detection code intact
so torch.cuda.is_available() can correctly detect the GPU, or add a conditional
install that skips torch reinstallation when a CUDA-enabled build is present.
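The prompt above suggests a conditional install. One way to sketch such a guard (the helper name is hypothetical, and this only checks whether torch is importable at all, not whether the installed build is CUDA-enabled):

```python
import importlib.util

def torch_needs_install() -> bool:
    """Return True only when no torch build is importable.

    On Colab a CUDA-enabled torch ships preinstalled, and a plain
    `pip install torch` would replace it with a CPU-only wheel, so the
    install should be skipped whenever torch is already present.
    """
    return importlib.util.find_spec("torch") is None

# In a notebook cell (sketch):
# if torch_needs_install():
#     %pip install torch
print(torch_needs_install())
```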

