Implement Story vs Narrative Modification POC #32

Merged: 6 commits, Oct 27, 2024

leonvanbokhorst (Owner) commented Oct 27, 2024

Summary by Sourcery

New Features:

  • Introduce a proof of concept module for modifying stories using narrative modifiers such as tone, perspective, and purpose.

…can modify story components using vector embeddings. It shows how applying different tones to story events can result in varied embeddings, and calculates the similarity to measure the degree of change.

The core idea is to represent textual elements as vectors (embeddings) and then manipulate these vectors to simulate the effect of narrative choices on the story. This approach allows for a quantitative analysis of how narrative decisions impact the underlying meaning or representation of story elements.
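
A minimal sketch of that idea, using only the standard library and toy 3-dimensional vectors in place of real model embeddings. The blending rule (linear interpolation) and the helper names `blend` and `cosine_similarity` are assumptions made for illustration; the PR itself may combine embeddings differently.

```python
import math
from typing import List


def blend(story: List[float], narrative: List[float], weight: float) -> List[float]:
    """Linear interpolation between two vectors (assumed blending rule)."""
    return [(1 - weight) * s + weight * n for s, n in zip(story, narrative)]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between a and b; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for an event embedding and a tone embedding.
event = [1.0, 0.0, 0.0]
tone = [0.0, 1.0, 0.0]

# A larger weight pulls the event further toward the tone vector,
# so its similarity to the original event drops.
for weight in (0.0, 0.3, 0.6):
    modified = blend(event, tone, weight)
    print(weight, round(cosine_similarity(event, modified), 3))
```

Under this assumed rule, the similarity score acts as a rough measure of how far a modifier has moved an event from its original representation.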

sourcery-ai bot (Contributor) commented Oct 27, 2024

Reviewer's Guide by Sourcery

This PR implements a proof of concept for story modification using narrative elements. The implementation uses language models and embedding techniques to alter story presentations while maintaining core elements. The system processes story elements (events, characters, settings) and applies narrative modifiers (tone, perspective, purpose) through embedding combinations and text generation.

Class diagram for Story vs Narrative Modification POC

classDiagram
    class OllamaInterface {
        +generate_embedding(text: str) async
        +generate(prompt: str) async
        +cleanup() async
    }
    class EmbeddingCache {
        +get(key: str) -> List[float]
        +set(key: str, value: List[float])
        +clear()
    }
    class StoryNarrativeModifier {
        +embed_elements(elements_dict: Dict[str, Any]) async -> Dict[str, Any]
        +apply_narrative(story_embedding: List[float], narrative_embedding: List[float], weight: float) -> np.ndarray
        +generate_modified_text(original_text: str, modified_embedding: List[float], narrative_modifier: str) async -> str
        +main() async
    }
    StoryNarrativeModifier --> OllamaInterface
    StoryNarrativeModifier --> EmbeddingCache
    note for StoryNarrativeModifier "This class orchestrates the story modification process using narrative elements."
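
The diagram only lists the `EmbeddingCache` method names, so the following is a hypothetical in-memory version that matches those signatures. The dictionary backing store, the `None` return on a miss, and the sample key are assumptions, not details taken from the PR.

```python
from typing import Dict, List, Optional


class EmbeddingCache:
    """In-memory cache keyed by the text that was embedded (hypothetical sketch)."""

    def __init__(self) -> None:
        self._store: Dict[str, List[float]] = {}

    def get(self, key: str) -> Optional[List[float]]:
        # Returns None on a miss so the caller can compute the embedding instead.
        return self._store.get(key)

    def set(self, key: str, value: List[float]) -> None:
        self._store[key] = value

    def clear(self) -> None:
        self._store.clear()


cache = EmbeddingCache()
cache.set("a stranger arrives", [0.1, 0.2, 0.3])  # made-up text and vector
print(cache.get("a stranger arrives"))
print(cache.get("text that was never embedded"))
```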

File-Level Changes

Change: Implementation of core story modification functionality
Details:
  • Created async function to generate and cache embeddings for story and narrative elements
  • Implemented narrative application logic using weighted combinations of embeddings
  • Added text generation function that applies narrative modifiers to original text
  • Developed main orchestration function that coordinates the entire modification process
Files: src/poc_story_vs_narrative.py

Change: Data structure setup for story and narrative elements
Details:
  • Defined sample story elements including events, characters, and settings
  • Created narrative modifier categories for tone, perspective, and purpose
  • Structured the data as nested dictionaries for easy access and modification
Files: src/poc_story_vs_narrative.py

Change: Integration with external dependencies and utilities
Details:
  • Set up logging configuration and error handling
  • Integrated with language model interface for text generation
  • Implemented embedding cache for optimization
  • Added cosine similarity calculations for comparing embeddings
Files: src/poc_story_vs_narrative.py

@leonvanbokhorst leonvanbokhorst merged commit 3de3007 into main Oct 27, 2024
1 check passed

sourcery-ai bot (Contributor) left a comment

Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider moving the sample story elements and narrative modifiers to a configuration file for better maintainability and easier testing with different datasets.
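
One way this suggestion could look, sketched with JSON and the standard library. The top-level keys mirror the categories named in the PR, but the file layout, the `load_config` helper, and every sample value are invented for illustration. A real setup would read the text from a file rather than from an inline string.

```python
import json
from typing import Any, Dict, Tuple

# Hypothetical config contents; sample values are made up.
CONFIG_TEXT = """
{
  "story_elements": {
    "events": ["A stranger arrives in town"],
    "characters": ["the stranger"],
    "settings": ["a small town"]
  },
  "narrative_modifiers": {
    "tone": ["somber", "humorous"],
    "perspective": ["first person"],
    "purpose": ["to warn"]
  }
}
"""


def load_config(text: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Parse the JSON text and return (story_elements, narrative_modifiers)."""
    data = json.loads(text)
    return data["story_elements"], data["narrative_modifiers"]


story_elements, narrative_modifiers = load_config(CONFIG_TEXT)
print(sorted(story_elements))
print(sorted(narrative_modifiers))
```

Keeping the data outside the module would let tests swap in a small dataset without touching the code.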
Here's what I looked at during the review
  • 🟡 General issues: 2 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

}


async def embed_elements(elements_dict: Dict[str, Any]) -> Dict[str, Any]:

suggestion (performance): Consider using asyncio.gather() to parallelize embedding generation for better performance

The current implementation processes embeddings sequentially. Using asyncio.gather() would allow concurrent processing of multiple embeddings, significantly improving performance for larger datasets.

async def embed_elements(elements_dict: Dict[str, Any]) -> Dict[str, Any]:
    # embed_element(key, value) is assumed to return a (key, embedded_value) pair.
    pairs = await asyncio.gather(
        *(embed_element(k, v) for k, v in elements_dict.items())
    )
    return dict(pairs)
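
A self-contained way to try this suggestion is sketched below. `fake_embedding` is a stand-in for the real model call (it sleeps briefly and returns a made-up vector), and `embed_element` is a hypothetical helper that returns a `(key, embeddings)` pair; neither comes from the PR.

```python
import asyncio
from typing import Any, Dict, List, Tuple


async def fake_embedding(text: str) -> List[float]:
    """Stand-in for a model call; sleeps briefly to simulate I/O latency."""
    await asyncio.sleep(0.01)
    return [float(len(text)), float(text.count(" "))]


async def embed_element(key: str, texts: List[str]) -> Tuple[str, List[List[float]]]:
    """Embed every text under one key concurrently and return (key, embeddings)."""
    embeddings = await asyncio.gather(*(fake_embedding(t) for t in texts))
    return key, list(embeddings)


async def embed_elements(elements: Dict[str, List[str]]) -> Dict[str, Any]:
    # gather() returns results in the order the awaitables were passed in,
    # so each key lines up with its own embeddings.
    pairs = await asyncio.gather(
        *(embed_element(k, v) for k, v in elements.items())
    )
    return dict(pairs)


result = asyncio.run(embed_elements({"events": ["a b", "c"], "settings": ["town"]}))
print(result)
```

Whether this helps in practice depends on the embedding backend: if it handles one request at a time, issuing the calls concurrently may bring little gain.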

return embedded_dict


def apply_narrative(story_embedding: List[float], narrative_embedding: List[float], weight: float = 0.5) -> np.ndarray:

suggestion: Add validation for the weight parameter to ensure it's between 0 and 1

Invalid weight values could lead to unexpected results. Consider adding a check like if not 0 <= weight <= 1: raise ValueError('Weight must be between 0 and 1')

Suggested change
def apply_narrative(story_embedding: List[float], narrative_embedding: List[float], weight: float = 0.5) -> np.ndarray:
    if not 0 <= weight <= 1:
        raise ValueError("Weight must be between 0 and 1")
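
For context, here is a runnable sketch of the check inside a complete function. The body is an assumption: it uses plain lists and linear interpolation instead of the PR's NumPy-based code, which is not shown on this page, and the length check is an extra guard added for illustration. Under this rule a weight above 1 would give the story embedding a negative coefficient, which is one way an unvalidated value could produce surprising vectors.

```python
from typing import List


def apply_narrative(
    story_embedding: List[float],
    narrative_embedding: List[float],
    weight: float = 0.5,
) -> List[float]:
    """Blend two embeddings; weight is the share given to the narrative."""
    if not 0 <= weight <= 1:
        raise ValueError("Weight must be between 0 and 1")
    if len(story_embedding) != len(narrative_embedding):
        raise ValueError("Embeddings must have the same length")
    # Assumed blending rule: linear interpolation between the two vectors.
    return [
        (1 - weight) * s + weight * n
        for s, n in zip(story_embedding, narrative_embedding)
    ]


try:
    apply_narrative([1.0, 0.0], [0.0, 1.0], weight=1.5)
except ValueError as exc:
    print(f"rejected: {exc}")
```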

for category, mods in narrative_modifiers.items()
}

# Modify and observe how the narrative affects the story

issue (complexity): Consider extracting the nested event processing logic into a dedicated function to improve code organization.

The main() function's nested loop structure makes the code flow difficult to follow. Consider extracting the event modification logic into a separate function:

async def process_event(
    event_name: str,
    event_embedding: List[float],
    embedded_narratives: Dict
) -> Dict:
    """Process a single event with all narrative modifiers."""
    results = {}
    for category, modifiers in embedded_narratives.items():
        for modifier_name, modifier_embedding in modifiers.items():
            modified_embedding = apply_narrative(
                event_embedding, modifier_embedding, weight=0.6
            )
            similarity = cosine_similarity([event_embedding], [modified_embedding])[0][0]

            modified_text = await generate_modified_text(
                event_name, modified_embedding, f"{category}: {modifier_name}"
            )

            results[(category, modifier_name)] = {
                "modified_embedding": modified_embedding,
                "similarity": similarity,
                "modified_text": modified_text,
            }
    return results

async def main():
    # ... embedding generation code ...

    modified_story = {}
    for event_name, event_embedding in zip(
        story_elements["events"], embedded_story["events"]
    ):
        event_results = await process_event(
            event_name, event_embedding, embedded_narratives
        )
        modified_story.update({
            (event_name, cat, mod): data 
            for (cat, mod), data in event_results.items()
        })

This refactoring:

  1. Reduces nesting depth from 3 to 2 levels
  2. Makes the event processing logic easier to test and modify
  3. Improves readability by separating concerns
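
To illustrate the testability point, the sketch below exercises a simplified copy of `process_event` with stubbed dependencies. The stubs are assumptions: `apply_narrative` uses an assumed linear blend, `cosine` is a standard-library replacement for the `cosine_similarity` call, and `generate_modified_text` only tags the text instead of calling a model.

```python
import asyncio
import math
from typing import Dict, List, Tuple


def apply_narrative(story: List[float], narrative: List[float], weight: float = 0.5) -> List[float]:
    # Stub: assumed linear blend, standing in for the PR's implementation.
    return [(1 - weight) * s + weight * n for s, n in zip(story, narrative)]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


async def generate_modified_text(text: str, embedding: List[float], modifier: str) -> str:
    # Stub: no model call, just tags the text with the modifier.
    return f"[{modifier}] {text}"


async def process_event(
    event_name: str,
    event_embedding: List[float],
    embedded_narratives: Dict[str, Dict[str, List[float]]],
) -> Dict[Tuple[str, str], dict]:
    """Simplified copy of the reviewer's function, wired to the stubs above."""
    results = {}
    for category, modifiers in embedded_narratives.items():
        for modifier_name, modifier_embedding in modifiers.items():
            modified = apply_narrative(event_embedding, modifier_embedding, weight=0.6)
            results[(category, modifier_name)] = {
                "modified_embedding": modified,
                "similarity": cosine(event_embedding, modified),
                "modified_text": await generate_modified_text(
                    event_name, modified, f"{category}: {modifier_name}"
                ),
            }
    return results


narratives = {"tone": {"somber": [0.0, 1.0]}}  # made-up modifier embedding
results = asyncio.run(process_event("A stranger arrives", [1.0, 0.0], narratives))
print(results[("tone", "somber")]["modified_text"])
```

Because the function takes its inputs as arguments, a test like this needs no model server and no cache.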

@leonvanbokhorst leonvanbokhorst deleted the poc_story_vs_narrative.py branch October 27, 2024 11:04