
3기_4주차_김수임 #4

Open
softee220 wants to merge 5 commits into HateSlop:main from softee220:softee220

Conversation

@softee220 softee220 commented Oct 4, 2025

IMDB data + sentiment analysis model

Summary by CodeRabbit

  • Style
    • Notebook cells now open expanded by default, improving readability and reducing extra clicks.
    • Previously collapsed cells are immediately visible, simplifying review and step-by-step execution.
    • No changes to computation, outputs, or error behavior; this update only affects presentation.
    • Applied consistently across multiple cells in the notebook for a more predictable viewing experience.


coderabbitai bot commented Oct 4, 2025

Walkthrough

Removed cell metadata flags in huggingface_basics.ipynb (UI-only change). Added a new Jupyter notebook softee220/huggingface_assignment.ipynb implementing a sentiment/emotion analysis workflow: environment setup, IMDB data sampling, a transformer text-classification pipeline, an emotion-normalizing helper, dataset mapping, and result curation.

Changes

Notebook metadata normalization: huggingface_basics.ipynb
Removed "collapsed": true entries from multiple notebook cell metadata fields; no code, logic, or execution changes.

New sentiment/emotion analysis notebook: softee220/huggingface_assignment.ipynb
Added a new workflow notebook that installs dependencies, loads the Stanford IMDB dataset, samples 200 examples, loads a text-classification pipeline (SamLowe/roberta-base-go_emotions), defines analyze_emotion to normalize pipeline outputs, maps annotations across the subset, and prepares DataFrame results (optional CSV export is commented out). No public API or exported signature changes.
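The analyze_emotion helper is only described above, not shown. A minimal sketch of such an output normalizer (the function body below is an assumption for illustration, not the notebook's actual code) could look like:

```python
def analyze_emotion(raw):
    """Normalize a text-classification pipeline output into {label, score}.

    Hugging Face pipelines can return a single dict, a list of dicts, or a
    nested list (when top_k / return_all_scores is used), so this sketch
    unwraps any nesting and keeps the highest-scoring entry.
    """
    # Unwrap nested lists: [[{...}, ...]] -> [{...}, ...]
    while isinstance(raw, list) and len(raw) == 1 and isinstance(raw[0], list):
        raw = raw[0]
    candidates = [raw] if isinstance(raw, dict) else list(raw)
    best = max(candidates, key=lambda d: d["score"])
    return {"label": best["label"], "score": float(best["score"])}

# Example with a top_k-style nested output:
print(analyze_emotion([[{"label": "joy", "score": 0.91},
                        {"label": "anger", "score": 0.04}]]))
# → {'label': 'joy', 'score': 0.91}
```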

Sequence Diagram(s)

sequenceDiagram
  actor User
  participant Notebook as "Jupyter Notebook\n(softee220/huggingface_assignment.ipynb)"
  participant Datasets as "datasets (IMDB)"
  participant Pipeline as "Transformer Pipeline\n(SamLowe/roberta-base-go_emotions)"
  participant Results as "DataFrame / CSV"

  User ->> Notebook: open & run cells
  Notebook ->> Datasets: load IMDB dataset (train split)
  Datasets -->> Notebook: dataset object / subset (200)
  Notebook ->> Pipeline: load text-classification pipeline
  Notebook ->> Pipeline: analyze_emotion(text) [map over subset]
  Pipeline -->> Notebook: labels + scores (varied shapes)
  Notebook ->> Notebook: normalize outputs into {label, score}
  Notebook ->> Results: assemble DataFrame (text, emotion)
  Notebook ->> User: display results (optional save)
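The flow above can be sketched end to end in plain Python. Here the transformer pipeline is replaced by an offline stub (the real notebook loads SamLowe/roberta-base-go_emotions), and the column names are assumptions:

```python
import pandas as pd

def stub_pipeline(text):
    # Stand-in for the real call:
    # pipeline("text-classification", model="SamLowe/roberta-base-go_emotions")
    label = "joy" if "great" in text else "disappointment"
    return [{"label": label, "score": 0.9}]

texts = ["a great movie", "a dull, plodding plot"]  # toy stand-in for the 200 IMDB samples
records = []
for text in texts:
    out = stub_pipeline(text)[0]  # pipeline returns a list of {label, score} dicts
    records.append({"text": text, "emotion": out["label"], "score": out["score"]})

df = pd.DataFrame(records)  # assemble results (text, emotion, score)
# df.to_csv("emotion_results.csv", index=False)  # optional export, as in the notebook
print(df["emotion"].tolist())  # → ['joy', 'disappointment']
```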

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

In a tidy patch I nibble code and hay,
I opened sleepy cells so they greet the day.
I sampled tales from silver-screened review,
I nudged emotions into labels true.
Hop, hop—short review, then off I play. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Title Check: ❓ Inconclusive
Explanation: The title “3기_4주차_김수임” reads like an assignment identifier rather than a summary of the key modifications, so it is too vague for a reviewer to understand the main updates at a glance.
Resolution: Update the title to a concise, descriptive summary of the changes, for example “Remove collapsed cell metadata in huggingface_basics and add IMDB sentiment analysis notebook”, so reviewers can immediately grasp the PR’s purpose.
✅ Passed checks (2 passed)
Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
softee220/huggingface_assignment.ipynb (1)

67-75: Import display explicitly.

display() isn’t defined in a plain Python runtime, and static analysis already flags it. Import it alongside the other notebook utilities to keep the code portable and lint-clean.

-from datasets import load_dataset, DatasetDict
-from transformers import pipeline
-import torch
-import pandas as pd # 데이터 확인용
+from datasets import load_dataset, DatasetDict
+from transformers import pipeline
+import torch
+import pandas as pd # 데이터 확인용
+from IPython.display import display
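Ruff flags display as undefined (the F821 findings below) because display is an IPython convenience name, not a Python builtin. A quick check, runnable anywhere:

```python
import builtins

# display() is injected into the interactive namespace by IPython/Jupyter at
# runtime; plain Python has no such builtin, hence Ruff's F821 on the notebook.
print(hasattr(builtins, "display"))  # → False in a plain Python runtime
```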
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 962235f and fab7daa.

📒 Files selected for processing (1)
  • softee220/huggingface_assignment.ipynb (1 hunks)
🧰 Additional context used
🪛 Ruff (0.13.3)
softee220/huggingface_assignment.ipynb

43-43: Undefined name display

(F821)


93-93: Undefined name display

(F821)


99-99: Undefined name display

(F821)

Comment on lines +65 to +75
"!pip install transformers datasets sentencepiece accelerate torch\n",
"\n",
"import datasets\n",
"from datasets import load_dataset, DatasetDict\n",
"from transformers import pipeline\n",
"import torch\n",
"import pandas as pd # 데이터 확인용\n",
"\n",
"# GPU 사용 가능 여부 확인 및 설정 (Colab에서는 보통 GPU 사용 가능)\n",
"device = 0 if torch.cuda.is_available() else -1\n",
"print(f\"사용 가능한 디바이스: {'GPU' if device == 0 else 'CPU'}\")"

⚠️ Potential issue | 🟠 Major

Avoid downgrading Colab’s CUDA-enabled torch.

Running pip install torch in Colab removes the preinstalled GPU build and pulls a CPU-only wheel, so torch.cuda.is_available() flips to False and all inference falls back to CPU. Drop the plain torch install (or replace it with a CUDA-specific wheel) to preserve GPU acceleration.

-!pip install transformers datasets sentencepiece accelerate torch
+!pip install transformers datasets sentencepiece accelerate
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"!pip install transformers datasets sentencepiece accelerate torch\n",
"\n",
"import datasets\n",
"from datasets import load_dataset, DatasetDict\n",
"from transformers import pipeline\n",
"import torch\n",
"import pandas as pd # 데이터 확인용\n",
"\n",
"# GPU 사용 가능 여부 확인 및 설정 (Colab에서는 보통 GPU 사용 가능)\n",
"device = 0 if torch.cuda.is_available() else -1\n",
"print(f\"사용 가능한 디바이스: {'GPU' if device == 0 else 'CPU'}\")"
!pip install transformers datasets sentencepiece accelerate
import datasets
from datasets import load_dataset, DatasetDict
from transformers import pipeline
import torch
import pandas as pd # 데이터 확인용
# GPU 사용 가능 여부 확인 및 설정 (Colab에서는 보통 GPU 사용 가능)
device = 0 if torch.cuda.is_available() else -1
print(f"사용 가능한 디바이스: {'GPU' if device == 0 else 'CPU'}")
🤖 Prompt for AI Agents
In softee220/huggingface_assignment.ipynb around lines 65 to 75, the notebook
unconditionally runs a plain "pip install torch" which on Colab replaces the
preinstalled CUDA-enabled torch with a CPU-only wheel and disables GPU; remove
the plain "pip install torch" line (or replace it with a Colab-compatible
CUDA-specific wheel only when needed) and leave the device detection code intact
so torch.cuda.is_available() can correctly detect the GPU, or add a conditional
install that skips torch reinstallation when a CUDA-enabled build is present.
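The prompt above suggests a conditional install. One way to sketch such a guard (the helper name is hypothetical, and this only checks whether torch is importable at all, not whether the installed build is CUDA-enabled):

```python
import importlib.util

def torch_needs_install() -> bool:
    """Return True only when no torch build is importable.

    On Colab a CUDA-enabled torch ships preinstalled, and a plain
    `pip install torch` would replace it with a CPU-only wheel, so the
    install should be skipped whenever torch is already present.
    """
    return importlib.util.find_spec("torch") is None

# In a notebook cell (sketch):
# if torch_needs_install():
#     %pip install torch
print(torch_needs_install())
```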

