
🏯 Adaptive VR AI Museum


A photorealistic Japanese VR museum featuring ukiyo-e artworks by Katsushika Hokusai, Utagawa Hiroshige, and Yamamoto Heikichi. Explore it alongside an intelligent AI guide powered by local LLMs with RAG technology that provides deep cultural insights. All visitor interactions are non-invasively logged locally for post-visit analysis by museum professionals.

For visitors: a highly personalized, engaging, and non-restrictive museum experience.

For professionals and enthusiasts: a decision-support tool for curatorial decisions and museum reconfigurations.

Built to honor Japan and its rich cultural heritage 🇯🇵

MuseumSystemShowcase




👀 Overview

This project is a fully immersive Japanese VR museum that combines AI technology with cultural heritage preservation. Unlike traditional virtual museums with pre-recorded audio guides, this system features a locally-running AI assistant with deep knowledge of the Japanese art in the museum, powered by RAG (Retrieval-Augmented Generation).

While visitors enjoy an engaging museum experience, the system non-invasively logs all interactions (such as gaze patterns, movement through space, conversations, engagement duration, survey replies and more). This data is exported in structured formats that museologists and cultural heritage professionals can analyze using external AI tools (like ChatGPT, Deepseek, Gemini, Claude or any other capable LLMs) to gain insights into visitor behavior, identify exhibition strengths and weaknesses, and optimize future museum layouts and content.

Important Notes: I am not a professional museologist or curator. This museum's layout, artwork selection, and exhibition design were created independently by me. The system demonstrates the technical framework and data collection capabilities, but the curatorial decisions reflect my personal interpretation. I chose these artworks, layout, and style because I love Japan.

This project is not yet considered complete, as it has not been extensively tested. There will always be room for improvement. Contributions from the community are welcome - feel free to open issues, suggest ideas, or submit pull requests.

What This Museum Offers

For Visitors:

  • 🎨 Explore a photorealistic Japanese museum with works by legendary ukiyo-e masters
  • 🤖 Converse naturally with an AI guide that deeply understands the specific Japanese art and culture presented in the museum
  • 🔒 Complete privacy - everything runs locally on your computer, no data sent to the cloud. No personal identifiers are saved.
  • 🎧 Voice interaction with natural speech-to-text and text-to-speech
  • 🏛️ Authentic museum atmosphere with traditional Japanese aesthetics

For Museum Professionals & Researchers:

  • 📊 Non-invasive visitor behavior logging (gaze tracking, movement patterns, conversations and more..)
  • 📈 Structured data export in CSV, TXT, and JSON formats for analysis and statistics
  • 🤖 Use exported data with external LLMs (ChatGPT, Deepseek, Gemini, Claude and more..) for new museum layout recommendations and general support in curatorial decisions
  • 📝 Statistical reports and insights generated through external LLMs using the visitors' anonymized data
  • 🎓 Research tool for understanding visitor engagement patterns
  • 🔄 Iterative improvement - Data from multiple visits reveals patterns for adaptation

The Complete Workflow:

  1. Visit Experience Phase

    • Visitors explore the VR museum naturally
    • A high-quality local AI chatbot answers any of their questions related to the exhibits
    • All interactions are non-invasively logged in the background
    • Upon exiting the museum, the visitor is given a short survey to fill out, which provides even more insights
  2. Data Collection Phase

    • HMD sensors track: XYZ coordinates, gaze direction, orientation
    • Virtual sensors log: Dwell time near exhibits, exhibit IDs, movement patterns
    • Chatbot records: All conversations, questions asked, topics explored
    • Survey system captures: Post-visit feedback and satisfaction ratings
  3. Data Export Phase

    • System compiles 4 structured files per visit
    • Formats: CSV (gaze data), TXT (journey narrative, chat logs), JSON or CSV (survey)
    • All data stays local unless the visitor chooses to share it
  4. Professional Analysis Phase

    • Collect data from multiple visitors
    • Upload logs to external LLM (ChatGPT, Claude, Gemini, DeepSeek, etc.)
    • Use prompt engineering to instruct the LLM to act as a museum curator/analyst
  5. Insight Generation Phase

    • LLM performs visitor profiling from behavioral patterns
    • Generates statistical reports on engagement metrics
    • Identifies layout optimization opportunities
    • Suggests experience enhancement strategies
    • Outputs natural language analytical reports for non-technical professionals
  6. Museum Improvement Phase

    • Apply insights to the 3D museum or even its real-world counterpart
    • Adjust content based on visitor interests
    • Preferably test changes in VR environment (risk-free)
    • Future visits benefit from optimized design

🎨 Museum Collections

This virtual museum showcases the works of three masters of Japanese ukiyo-e (浮世絵):

🌊 Katsushika Hokusai (葛飾北斎) 1760-1849

Japan's most internationally recognized artist, creator of the iconic "Great Wave off Kanagawa" and the "Thirty-Six Views of Mount Fuji" series. Hokusai revolutionized landscape art and profoundly influenced Western Impressionism.

In the Museum:

  • The Great Wave off Kanagawa
  • Red Fuji (Fine Wind, Clear Morning)
  • Multiple views from the Mount Fuji series
  • Examples of his dynamic composition style

🏞️ Utagawa Hiroshige (歌川広重) 1797-1858

Master of lyrical landscape prints, renowned for his atmospheric depictions of nature and the changing seasons. His "Fifty-three Stations of the Tōkaidō" captured the spirit of travel in Edo-period Japan.

In the Museum:

  • Stations from the Tōkaidō road series
  • One Hundred Famous Views of Edo
  • Seasonal landscape scenes
  • Examples of his poetic use of color and weather

🎭 Heikichi (平吉)

A skilled ukiyo-e artist whose works complement the collection, showcasing additional perspectives on Japanese artistic traditions and cultural themes.

In the Museum:

  • Representative works demonstrating ukiyo-e techniques
  • Cultural and artistic contributions to the genre

*These artworks are now in the public domain.


💾 System's Exports

What This System Provides

This museum is also a research tool for understanding visitor engagement and optimizing exhibitions. It currently exports four main file types, described below:

Data Outputs (4-5 Files per Visit)

1. Gaze Log (CSV) - (gaze/GazeLog_[ID].csv)

  • Which artworks were looked at
  • How long the visitor spent viewing each piece
  • Order of the artworks visited

Quantitative data on artwork engagement:

Hokusai Exhibition Room|The Great Wave off Kanagawa|58,10
Hokusai Exhibition Room|South Wind, Clear Sky|15,72
Hokusai Exhibition Room|The Great Wave off Kanagawa|8,20
Hiroshige Exhibition Room|Sudden Shower over Shin-Ohashi Bridge|19,60
Hiroshige Exhibition Room|100 Famous Views of Edo|24,48

Used for:

  • Identifying high/low engagement artworks
  • Understanding visual attention patterns
  • Tracking engagement distribution
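As a rough illustration of how this export can be consumed programmatically, the following C# sketch sums total viewing time per artwork from a gaze log in the pipe-delimited layout shown above. The file name and the comma decimal separator are assumptions based on the sample, not guarantees from the project.

```csharp
// Sketch: aggregate total viewing time per artwork from a gaze log.
// Assumes the pipe-delimited layout shown above (room|artwork|seconds)
// and a comma decimal separator; the file name is illustrative.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

class GazeLogSummary
{
    static void Main()
    {
        var totals = new Dictionary<string, double>();

        foreach (var line in File.ReadLines("GazeLog_5.csv"))
        {
            var parts = line.Split('|');
            if (parts.Length < 3) continue;  // skip malformed rows

            string artwork = parts[1].Trim();
            // Durations such as "58,10" use a comma as the decimal separator
            double seconds = double.Parse(
                parts[2].Trim().Replace(',', '.'), CultureInfo.InvariantCulture);

            totals.TryGetValue(artwork, out double sum);
            totals[artwork] = sum + seconds;
        }

        foreach (var entry in totals)
            Console.WriteLine($"{entry.Key}: {entry.Value:F1}s total");
    }
}
```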

2. Journey Log (TXT) - (journey/JourneyLog_[ID].txt)

Human-readable visitor path based on the gaze log:

The visitor begins their journey at **Hokusai Exhibition Room** in the Museum. Immediately after, they observed "The Great Wave off Kanagawa" for 10s. Continuing on, they spent 58s looking at "The Great Wave off Kanagawa". Subsequently, they studied "The Great Wave off Kanagawa" for 20s. Immediately after, the visitor made their way to another section of the **Hokusai Exhibition Room**. Immediately after, they admired "South Wind, Clear Sky" for 72s. Afterwards, they admired "South Wind, Clear Sky" for 15s. Continuing on, the visitor moved on to the **Hiroshige Exhibition Room**. Continuing on, they inspected "Sudden Shower over Shin-Ohashi Bridge" for 60s. Subsequently, they inspected "Sudden Shower over Shin-Ohashi Bridge" for 19s. Then, the visitor advanced to another section of the **Hiroshige Exhibition Room**. Then, they spent 48s looking at "100 Famous Views of Edo". Next, the visitor carefully looked at "100 Famous Views of Edo" for 24s. In summary, they spent 165s in the **Hokusai Exhibition Room**, 151s in the **Hiroshige Exhibition Room**, with a total of 316s inside the Japanese Museum.

Used for:

  • Understanding navigation patterns
  • Identifying bottlenecks or flow issues
  • Recognizing thematic interest clusters
  • Mentally visualizing the visitor journey

3. Chat Log (TXT) - (chat/ChatLog_[ID].txt)

A conversation transcript with the complete dialogue of the visitor and the AI guide:

[0:50] Visitor: Hello, can you tell me some info and the meaning of this artwork? (The visitor is currently at the Hokusai Exhibition Room. The visitor is currently looking at the artwork / exhibit title: The Great Wave off Kanagawa. ONLY GIVE ANSWERS ABOUT THIS SPECIFIC ARTWORK.)

[1:15] Museum Guide (AI): The Great Wave off Kanagawa is one of the most iconic woodblock prints by Japanese artist Katsushika Hokusai, created as part of his series Thirty-Six Views of Mount Fuji. The artwork depicts a towering wave curling with immense energy, seemingly about to engulf small boats, while Mount Fuji stands calmly in the background. The contrast between the transient power of nature and the enduring stability of the mountain reflects themes of human vulnerability, resilience, and the sublime force of the natural world.

[2:00] Visitor: When was this created exactly? What is the date? (The visitor is currently at the Hokusai Exhibition Room. The visitor is currently looking at the artwork / exhibit title: The Great Wave off Kanagawa. ONLY GIVE ANSWERS ABOUT THIS SPECIFIC ARTWORK.)

[2:27] Museum Guide (AI): The Great Wave off Kanagawa was created around 1831, during the late Edo period in Japan. It was published as part of Hokusai’s Thirty-Six Views of Mount Fuji series, which celebrated the cultural and spiritual significance of the mountain. The print quickly became famous both in Japan and internationally, influencing generations of artists and shaping the global appreciation of Japanese ukiyo-e woodblock prints.

Used for:

  • Understanding what information visitors seek
  • Identifying knowledge gaps in current exhibits
  • Revealing visitor interests and curiosity patterns
  • Guiding content development

4. Survey Data (JSON or CSV) - (Downloaded through your browser as Survey_[ID].json and Survey_[ID].csv)

Structured post-visit feedback:

{
  "first_time_visit_in_the_museum": "Yes",
  "visitor_age_group": "18-25",
  "visitor_background_in_cultural_heritage": "None",
  "primary_motivation_for_visiting_today": "casual_learner",
  "overall_virtual_museum_experience_rating": "4",
  "most_enjoyed_exhibition_room": "Hokusai",
  ...
}

Used for:

  • Measuring visitor satisfaction
  • Collecting qualitative feedback
  • Identifying areas for improvement
  • Tracking trends across multiple visitors
  • Visitor profile clustering
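If you want to post-process the survey exports in code rather than feed them straight to an LLM, a minimal sketch of reading the JSON shown above with Unity's JsonUtility could look like this. Only a subset of the fields is modelled, and the class and method names are illustrative, not part of the project.

```csharp
// Sketch: deserializing a survey export with Unity's JsonUtility.
// Field names mirror the JSON keys shown above; only a subset is modelled.
using System.IO;
using UnityEngine;

[System.Serializable]
public class SurveyData
{
    public string first_time_visit_in_the_museum;
    public string visitor_age_group;
    public string visitor_background_in_cultural_heritage;
    public string primary_motivation_for_visiting_today;
    public string overall_virtual_museum_experience_rating;
    public string most_enjoyed_exhibition_room;
}

public static class SurveyLoader
{
    // e.g. SurveyLoader.Load("Survey_5.json")
    public static SurveyData Load(string path)
    {
        return JsonUtility.FromJson<SurveyData>(File.ReadAllText(path));
    }
}
```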

Using Data for Exhibition Optimization

Step 1: Collect Visitor Data

  • Run the museum with multiple visitors
  • Gather the 4-5 output files from each visit
  • Organize files by visitor ID for analysis

Step 2: LLM-Based Analysis Phase

Provide the collected data to external LLMs (ChatGPT, Claude, Gemini, DeepSeek) with specialized prompts. Make sure to describe the current exhibition layout, how the museum is structured, and which exhibits are included. Without this information the LLM cannot optimize the museum, because it has no knowledge of the layout the visitors traversed.

Step 3: Insight and Statistics Interpretation

The LLM will analyze patterns and provide:

  • Statistical engagement metrics
  • Layout optimization recommendations
  • Content gap identification
  • Navigation flow improvements

As an enthusiast or professional, you can reason over these outputs and ask the LLM further questions to assist curatorial decisions.

Step 4: Implementation & Testing

  • Apply recommended changes to VR museum layout
  • Run new visitor sessions to validate improvements
  • Compare before/after engagement metrics
  • Iterate based on results

Step 5: Physical Museum Application (Optional)

  • Once validated in VR, apply insights to physical museum spaces
  • Lower risk due to VR testing phase
  • Evidence-based decisions for stakeholders

The tool's role is supportive; it is not a replacement for professional museologists.


💻 Info for Devs

If you want to examine the system's implementation further, you can download the project and install its dependencies as described in the "Unity Installation" section below. This section quickly brushes over the internal scripts for developers who want to understand the codebase.

The museum operates through core subsystems that work together:

| Subsystem | Description | Key Scripts (Most Relevant) |
| --- | --- | --- |
| Gaze Tracking System | Detects and logs what visitors look at, including timing and exhibit focus. | GazeInteractable.cs, GazeInteractor.cs (GazeInteraction/), GazeSense.cs (Dimitris Kostadimas/Scripts/) |
| Spatial Tracking System | Tracks visitor position, room transitions, and movement patterns. | PositionSense.cs, RoomSense.cs (Dimitris Kostadimas/Scripts/) |
| AI Conversation System | Sends enriched context to the local Ollama model and handles AI responses. | OllamaIntegration.cs (Dimitris Kostadimas/Scripts/) |
| RAG Knowledge System | Retrieves relevant exhibit documents for AI responses using embeddings + cosine similarity search. | OllamaIntegration.cs (Dimitris Kostadimas/Scripts/) |
| TTS & STT Modules | Manages voice input via Whisper and voice output via a PowerShell TTS module. | WhisperControl.cs, ChatbotTTS_Powershell.cs (Dimitris Kostadimas/Scripts/) |
| Data Logging System | Records gaze, chat, movement, room presence, and survey data using unified session IDs. | GazeSense.cs, RoomSense.cs, PositionSense.cs, ChatbotLogger.cs (Dimitris Kostadimas/Scripts/), custom HTML survey page |

🎯 Core Scripts Breakdown

Gaze Tracking: GazeSense.cs

Purpose: Tracks which exhibits visitors look at and for how long.

  • Uses Unity raycasting from HMD to detect exhibit focus
  • Filters out brief glances (configurable minGazeTimeThreshold)
  • Exports data in two formats:
    • CSV (gaze_log_[ID].csv) - Structured quantitative data
    • TXT (journey_log_[ID].txt) - Natural language narrative
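For readers who just want the gist without opening the project, here is a minimal sketch of the raycast-plus-threshold idea described above. It is not the actual GazeSense.cs; the "Exhibit" tag and the logging call are illustrative assumptions.

```csharp
// Minimal sketch of HMD-forward raycasting with a minimum gaze threshold.
using UnityEngine;

public class GazeSketch : MonoBehaviour
{
    public Camera hmdCamera;                 // the VR camera
    public float minGazeTimeThreshold = 1f;  // ignore glances shorter than this
    public float maxGazeDistance = 10f;

    private GameObject currentTarget;
    private float gazeTimer;

    void Update()
    {
        var ray = new Ray(hmdCamera.transform.position, hmdCamera.transform.forward);

        if (Physics.Raycast(ray, out RaycastHit hit, maxGazeDistance) &&
            hit.collider.CompareTag("Exhibit"))
        {
            if (hit.collider.gameObject == currentTarget)
            {
                gazeTimer += Time.deltaTime;   // still looking at the same exhibit
            }
            else
            {
                FlushGaze();                   // target changed: log the previous one
                currentTarget = hit.collider.gameObject;
                gazeTimer = 0f;
            }
        }
        else
        {
            FlushGaze();
            currentTarget = null;
        }
    }

    void FlushGaze()
    {
        if (currentTarget != null && gazeTimer >= minGazeTimeThreshold)
            Debug.Log($"{currentTarget.name} viewed for {gazeTimer:F1}s");
        gazeTimer = 0f;
    }
}
```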

Spatial Tracking: RoomSense.cs + PositionSense.cs

RoomSense.cs

Purpose: Detects which room/gallery the visitor is currently in through trigger-collider-based detection.

Detection Modes:

public enum DetectionMode { 
    Triggers,        // Uses Unity trigger colliders (physics-based)
    PositionCheck,   // Manual bounds checking each frame
    Auto             // Automatically chooses best method
}
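A minimal sketch of what the trigger-based mode boils down to is shown below. It is not the project's RoomSense.cs; the "Player" tag and the static CurrentRoom property are illustrative assumptions.

```csharp
// Sketch: each room has a trigger collider; entering it updates the current room.
using UnityEngine;

public class RoomTriggerSketch : MonoBehaviour
{
    public string roomName = "Hokusai Exhibition Room";
    public static string CurrentRoom { get; private set; } = "Unknown";

    void OnTriggerEnter(Collider other)
    {
        if (other.CompareTag("Player"))        // the visitor rig's collider
        {
            CurrentRoom = roomName;
            Debug.Log($"Visitor entered: {roomName}");
        }
    }
}
```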

PositionSense.cs (WIP)

Purpose: Validates that the visitor is physically near an exhibit before logging gaze. Why This Matters: It prevents false-positive gaze logging when a visitor looks at an exhibit from across the room.
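The proximity check itself can be reduced to a distance comparison, roughly like the sketch below. The radius value and names are illustrative assumptions, not the actual PositionSense.cs API.

```csharp
// Sketch: only accept a gaze hit if the visitor stands within a radius of the exhibit.
using UnityEngine;

public static class ProximityCheck
{
    public static bool IsNearExhibit(Transform visitor, Transform exhibit, float maxDistance = 3f)
    {
        // Ignore height differences so headset height does not affect the check
        Vector3 a = new Vector3(visitor.position.x, 0f, visitor.position.z);
        Vector3 b = new Vector3(exhibit.position.x, 0f, exhibit.position.z);
        return Vector3.Distance(a, b) <= maxDistance;
    }
}
```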

AI Conversational Chatbot: OllamaIntegration.cs

Purpose: Core AI brain - handles LLM communication and RAG system.

The system works by combining the visitor's question with the current museum context; the request is then sent to Ollama through local API calls.

Architecture:

┌─────────────────────────────────────────────┐
│         OllamaIntegration.cs                │
├─────────────────────────────────────────────┤
│                                             │
│  ┌──────────────┐      ┌──────────────┐     │
│  │ Normal Mode  │      │   RAG Mode   │     │
│  │              │      │              │     │
│  │ User Prompt  │      │ User Prompt  │     │
│  │      ↓       │      │      ↓       │     │
│  │   Ollama     │      │  Extract     │     │
│  │   /generate  │      │  Keywords    │     │
│  │      ↓       │      │      ↓       │     │
│  │   Response   │      │  Embed Query │     │
│  └──────────────┘      │      ↓       │     │
│                        │  Similarity  │     │
│                        │  Search      │     │
│                        │      ↓       │     │
│                        │  Top-K Docs  │     │
│                        │      ↓       │     │
│                        │  Augmented   │     │
│                        │  Prompt      │     │
│                        │      ↓       │     │
│                        │   Ollama     │     │
│                        │   /generate  │     │
│                        │      ↓       │     │
│                        │   Response   │     │
│                        └──────────────┘     │
└─────────────────────────────────────────────┘
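For orientation, a minimal sketch of the Normal Mode path (a non-streaming POST to Ollama's /api/generate endpoint from Unity) could look like the following. The class, field, and model names are illustrative; see OllamaIntegration.cs for the real implementation.

```csharp
// Sketch: non-streaming request to the local Ollama /api/generate endpoint.
using System.Collections;
using System.Text;
using UnityEngine;
using UnityEngine.Networking;

public class OllamaSketch : MonoBehaviour
{
    [System.Serializable] class GenerateRequest { public string model; public string prompt; public bool stream; }
    [System.Serializable] class GenerateResponse { public string response; }

    public IEnumerator Ask(string prompt)
    {
        string body = JsonUtility.ToJson(new GenerateRequest {
            model = "gemma2:2b", prompt = prompt, stream = false });

        using (var req = new UnityWebRequest("http://localhost:11434/api/generate", "POST"))
        {
            req.uploadHandler = new UploadHandlerRaw(Encoding.UTF8.GetBytes(body));
            req.downloadHandler = new DownloadHandlerBuffer();
            req.SetRequestHeader("Content-Type", "application/json");

            yield return req.SendWebRequest();

            if (req.result == UnityWebRequest.Result.Success)
            {
                var reply = JsonUtility.FromJson<GenerateResponse>(req.downloadHandler.text);
                Debug.Log(reply.response);   // the real system hands this to subtitles/TTS
            }
            else
            {
                Debug.LogError(req.error);   // e.g. Ollama not running on port 11434
            }
        }
    }
}
```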

RAG System

The RAG subsystem enriches the AI’s responses by retrieving relevant exhibit information from a local document database. It works in two stages:

1. Knowledge Base Initialization

  • All exhibit documents (TextAssets) are loaded at startup.
  • Optional text chunking is applied for large documents.
  • Each document or chunk is converted into an embedding vector using the selected embedding model.
  • The system is ready once all vectors are stored for similarity search.

2. Query Processing

  • The visitor’s question is analyzed to extract the most likely artwork/topic.
  • That keyword is embedded and compared against all stored vectors using cosine similarity.
  • The top-K most relevant documents are selected.
  • These documents are appended as context to the visitor’s query.
  • The augmented prompt is then sent to the local LLM (Ollama) for a more accurate exhibit-aware response.
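The retrieval step in stage 2 essentially reduces to a cosine similarity ranking over the stored vectors; a minimal sketch, with illustrative names rather than the project's actual API, is shown below.

```csharp
// Sketch: cosine similarity + top-K selection over stored document embeddings.
using System;
using System.Collections.Generic;
using System.Linq;

public static class RagRetrieval
{
    public static float CosineSimilarity(float[] a, float[] b)
    {
        float dot = 0f, magA = 0f, magB = 0f;
        for (int i = 0; i < a.Length; i++)
        {
            dot  += a[i] * b[i];
            magA += a[i] * a[i];
            magB += b[i] * b[i];
        }
        return dot / ((float)(Math.Sqrt(magA) * Math.Sqrt(magB)) + 1e-8f);
    }

    // Returns the indices of the topK most similar document vectors to the query.
    public static int[] TopK(float[] query, IList<float[]> docVectors, int topK = 3)
    {
        return docVectors
            .Select((vec, index) => (index, score: CosineSimilarity(query, vec)))
            .OrderByDescending(x => x.score)
            .Take(topK)
            .Select(x => x.index)
            .ToArray();
    }
}
```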

Configurable Parameters

  • EnableRAG — Enables or disables the entire retrieval system.
  • EmbedModel — The embedding model used for vector generation (e.g., nomic-embed-text).
  • TopK — Number of documents to retrieve per query (default: 3).
  • UseChunking — Splits long documents into smaller chunks to improve retrieval accuracy.
  • ChunkCharLimit — Maximum size per chunk when chunking is enabled.

This system ensures that AI responses remain grounded in the museum’s curated knowledge base and can be updated simply by modifying the included documents.

TTS & STT Modules: WhisperControl.cs + ChatbotTTS_Powershell.cs

WhisperControl.cs along with other Whisper.Unity scripts

Purpose: Handles speech-to-text input from visitors. In this script the visitor's prompt is enriched with additional context (such as the visitor's current focus) and then sent to Ollama. Visitor presses button → Microphone records → Whisper transcribes and enriches the prompt → Sent to Ollama

ChatbotTTS_Powershell.cs

Purpose: Converts AI text responses to speech using Windows TTS. The Ollama LLM reply is passed to this script, which runs a PowerShell command to speak the reply out loud to the visitor.

This approach was chosen because it is simple and avoids third-party libraries and external dependencies. It runs asynchronously, and the voice quality is decent without making the system more demanding to run. However, it is limited to Windows systems.
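A minimal sketch of the underlying idea (spawning powershell.exe and using the built-in System.Speech synthesizer) is shown below. It is Windows-only, the quoting is simplified, and it is not the actual ChatbotTTS_Powershell.cs.

```csharp
// Sketch: speak a string via PowerShell and the built-in Windows speech synthesizer.
using System.Diagnostics;

public static class PowershellTtsSketch
{
    public static void Speak(string text)
    {
        // Escape single quotes for the PowerShell single-quoted string literal
        string safe = text.Replace("'", "''");
        string command =
            "Add-Type -AssemblyName System.Speech; " +
            $"(New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak('{safe}')";

        var psi = new ProcessStartInfo("powershell.exe", $"-NoProfile -Command \"{command}\"")
        {
            CreateNoWindow = true,
            UseShellExecute = false
        };
        Process.Start(psi);   // fire-and-forget; the real script handles this asynchronously
    }
}
```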

General Info

🤖 The Intelligent Local AI Guide

Powered By:

  • Ollama - Local LLMs running on your machine
  • RAG Technology - Retrieves relevant information from curated knowledge base
  • Nomic Embeddings - Semantic search for contextual responses
  • Context Awareness - Knows which artwork you're viewing via gaze tracking

How It Works:

  1. You approach an artwork and speak naturally
  2. The system detects which piece you're viewing (via gaze tracking)
  3. AI retrieves relevant knowledge from its curated database
  4. Generates personalized, contextual response
  5. Speaks answer back to you in natural voice

RAG (Retrieval-Augmented Generation) Pipeline:

  1. Knowledge Base Creation:

    • Curated documents about Japanese art, artists, history
    • Embedded into vector database using Nomic
    • Includes ukiyo-e techniques, cultural context, artist biographies
  2. During Conversation:

    • User asks question near artwork
    • Gaze tracking identifies which artwork
    • Question + artwork context embedded as query vector
    • Cosine similarity search finds relevant knowledge passages
    • Top-k passages injected into LLM prompt
    • LLM generates contextual, accurate response
  3. Why This Works:

    • Fewer AI hallucinations
    • Responses grounded in real art historical knowledge
    • Context-aware based on what you're viewing
    • Conversational memory for follow-up questions

🔄 Complete AI Interaction Flow

Here's how all systems work together when a visitor asks a question:

┌─────────────────────────────────────────────────────────────┐
│ 1. VISITOR PRESSES BUTTON (WhisperControl.cs)               │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 2. MICROPHONE RECORDS AUDIO                                 │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 3. WHISPER TRANSCRIBES → "Tell me about this artwork"       │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 4. CONTEXT ENRICHMENT (WhisperControl.cs)                   │
│    - RoomSense: "Visitor at Gallery_Hokusai"                │
│    - GazeSense: "Looking at The Great Wave"                 │
│    → Enriched: "Tell me about... (at Hokusai, looking..."   │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 5. LOG VISITOR MESSAGE (ChatbotLogger.cs)                   │
│    → ChatLog_ID5.txt: "[0:15] Visitor: Tell me about..."    │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 6. SEND TO OLLAMA (OllamaIntegration.cs)                    │
│                                                             │
│    IF RAG ENABLED:                                          │
│    ├─ Extract topic: "The Great Wave off Kanagawa"          │
│    ├─ Embed topic → [0.2, -0.5, 0.8, ...]                   │
│    ├─ Search knowledge base (cosine similarity)             │
│    ├─ Retrieve top-3 relevant documents                     │
│    └─ Augment prompt with context                           │
│                                                             │
│    → POST to http://localhost:11434/api/generate            │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 7. OLLAMA GENERATES RESPONSE                                │
│    → "The Great Wave off Kanagawa is one of Hokusai's..."   │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 8. HANDLE REPLY (OllamaIntegration.cs)                      │
│    ├─ Log to ChatbotLogger                                  │
│    ├─ Display as subtitle (if enabled)                      │
│    └─ Send to TTS (if enabled)                              │
└───────────────────────────┬─────────────────────────────────┘
                            ▼
┌─────────────────────────────────────────────────────────────┐
│ 9. TEXT-TO-SPEECH (ChatbotTTS_Powershell.cs)                │
│    → Windows TTS speaks the response                        │
└─────────────────────────────────────────────────────────────┘

🔒 Privacy by Design

Everything stays on your computer:

  • AI runs locally via Ollama (no cloud API calls)
  • Speech processing happens on your machine
  • Logs stored locally only
  • No telemetry, no tracking, no external data transmission
  • No personal identifiers are logged. Instead, each log file is named with an incremental ID number.

You are in control:

  • Review all logged data in plain text files
  • Delete logs anytime
  • Choose what to share with researchers (if anything)
  • Optional: Use external LLMs for analysis, but YOU decide

⚙️ System Requirements

To run the Virtual AI Museum, your system should meet the following requirements. The system can also run without a VR device! You can simply use "WASD" to move and "M" to start/stop talking to the AI guide.

Operating System: Windows 10 (64-bit) or later is required.

VR Hardware (Optional): Any Unity XR-compatible headset will work. The system has been tested with Meta Quest 2.

VR Software: You’ll need Meta Quest Link or SteamVR, depending on your headset setup (since connection with a Windows PC is required)

AI Server: Install Ollama and ensure it runs locally at http://localhost:11434.

AI Model: Choose and install one compatible model via Ollama, such as Gemma2:2B, DeepseekV2, Llama3, GPT-OSS or any other...

Embedding Model: The system requires nomic-embed-text for retrieval-augmented generation (RAG) functionality.

Microphone: A microphone is essential for voice interaction. It’s recommended to use the built-in mic on your VR headset.

Browser: Firefox is recommended and should be set as your default browser to support survey downloads at the end of each visit. This is recommended because Firefox allows simultaneous downloads by default, and the survey webpage downloads both the JSON and CSV exports at once.

Disable Antivirus Software (Optional): It is recommended that you temporarily disable your antivirus software if you face any issues with the app, since it creates several background processes that might get interrupted by antivirus software.

Unity + Assets (Optional): It is recommended that you run the app from within the Unity engine to avoid any incompatibility issues, but this is not necessary. If you do decide to run it from within Unity, you should install a version greater than 6000.2.8f1 and also install all the required assets as described in the Installation section.


📥 Executable Installation

Step-by-Step Setup

1. Install Ollama

Download and install Ollama from https://ollama.ai/

1.2 Pull a lightweight LLM

I recommend installing gemma2:2b because it is very lightweight and, through testing, had acceptable performance and reply time. It is currently set as the default model for the system. If you downloaded the executable software, you'll have to pull this model by running the following command in your Windows terminal (CMD):

ollama pull gemma2:2b

If your computer has better performance and you plan to run this app through Unity, you can change the model used by the software through the CustomAI-DimitrisKostadimas -> OllamaLLM object. On it you will find an attached OllamaIntegration script with several options, including the "ollama model name". In that case I can recommend DeepseekV2, which can be pulled with the following command:

ollama pull DeepseekV2
1.3 Pull embedding model for RAG

As the implementation uses Retrieval-Augmented Generation (RAG), you will need an embedding model. As in the previous step, you can pull any model you want and modify the script in Unity, but if you run the museum through the executable, just pull the following model:

ollama pull embeddinggemma
1.4 Start Ollama Server

For the museum to work properly you have to leave your downloaded model running in the background. To do that, open a CMD window and run the following command:

ollama run gemma2:2b

If you end up using another LLM, replace "gemma2:2b" with the name of your downloaded LLM.

2. Download and Run the Museum

Pre-built Executable:

  1. Download the latest release from Releases
  2. Run the installer
  3. After installation, run the executable
  4. Enjoy your visit!

As mentioned in the requirements section:

  • Make sure you set your default microphone to your VR device's microphone if you use a VR headset
  • Disable your antivirus if you face issues
  • Recommended: Download Firefox and set it as your default browser

👾 Unity Installation

If you plan on running the software in Unity, then in addition to step 1 of the previous section (installing Ollama, LLMs, and embedding models) you will need to install all the following components and assets, which are not included due to licensing reasons. But don't worry! They are fully free and easy to install. Here are the steps for Unity:

1. Download Unity Project

Download this repository and open it as a Unity project through Unity hub.

2. Download and install Ollama + LLMs + Embedding Models

Download and run Ollama models as explained in step 1 of the previous section.

3. Install the Required Assets and Components

For the project to work properly you will have to install the following libraries/assets/components:

  1. Unity Standard Assets

This is a deprecated package and Unity decided to make it completely disappear from existence. However, there are third-party repositories (like 1, 2, 3) that may host versions of the package. These repositories are not maintained, endorsed, or affiliated with Unity or with this project. Use them at your own discretion and risk.

  2. XR Interaction Toolkit

You can download and install the XR Interaction Toolkit through the Unity package manager, either by typing its name in the search bar or by searching for its identifier com.unity.xr.interaction.toolkit.

Additionally a copy of it is hosted by the official Unity GitHub account in this repository: Unity-Technologies/XR-Interaction-Toolkit-Examples

  3. TextMeshPro

You can download and install the TextMeshPro package through the Unity package manager either by simply searching for “TextMeshPro” or using the identifier com.unity.textmeshpro.

  4. All the following assets:

Apartment Door - DevDen

Japanese zelkova - Keyaki - HoloLab Store

Japanese Street Lamp - Edo Factory

Low Poly Casual Fox Mask - Revereel Studio

Shrine Pack (Torii, StoneLantern, PaperLantern) - logicalbeat

Wakizashi Short Sword - Tanuki Digital

All of them can be downloaded directly from the Asset Store. After downloading, move their respective folders and files to the "AssetStorePrefabs" folder found in the project.

4. Open the VR-AI Museum scene.

You can find this scene in the "Dimitris Kostadimas" folder.

5. Optional Upgrades

As the project makes use of Whisper.cpp, it relies on certain .bin files, which are models for speech recognition and can be found in "Assets/Dimitris Kostadimas/Models".

Because GitHub does not allow uploading files larger than 100 MB, the "ggml-medium" model that was supposed to be used for the project was removed and replaced with "ggml-base-q8_0".

However, you can download "ggml-medium" or any other enhanced version of the model from the whisper.cpp repository on Hugging Face: ggerganov/whisper.cpp

After downloading the corresponding .bin file, copy it into the "Assets/Dimitris Kostadimas/Models" folder. Then, find the "WhisperSTT" object in the hierarchy and modify the "WhisperManager" script inspector property called "Model Path" so it contains the correct file name.


⚡ Quick Start

First Visit

  1. Start Ollama with the corresponding model in a terminal (keep it running)
  2. Put on your VR headset (or use your keyboard) and launch the application
  3. Explore freely, approaching artworks that interest you
  4. Speak naturally to ask the AI questions (use the "A" button on Oculus controllers or the "M" button on the keyboard)
  5. Complete survey at exit (opens automatically in browser)

Tips for Best Experience

  • Speak clearly and wait for AI response (processing takes a few seconds but performance may vary depending on the computer specs)
  • Look directly at artworks before asking questions (helps AI know context)
  • Ask any question about the exhibit to the AI agent
  • Take your time - there's no rush, explore at your own pace

After Your Visit

Your data is saved at VR AI Museum by Dimitris Kostadimas_Data\Dimitris Kostadimas (or Assets/DimitrisKostadimas/Logs/ within Unity):

  • gaze/ - Viewing duration per artwork
  • journey/ - Natural language path description
  • chat/ - Conversation transcript

The survey log files will be downloaded automatically after you click "Submit Form" on the survey page, which can be found at dimitriskostadimas.gr/VR-AI-Museum-Survey.html. These files will be saved to your browser's default download location.


🔧 Technical Details

Technology Stack

VR & Rendering:

  • Unity Engine 6000.2.8f1 +
  • Unity XR Interaction Toolkit
  • Custom Photorealistic 3D environment

AI & NLP:

  • Ollama (Local LLM server)
  • Gemma2 recommended (language model)
  • Whisper.cpp - Whisper.Unity (Speech-to-Text, local)
  • Windows PowerShell TTS
  • Custom RAG implementation
  • Embeddinggemma (Vector search - Embedding model)

Data Collection:

  • Gaze tracking (raycasting)
  • Position/rotation logging
  • Event-driven chat recording
  • Web-based survey system

🛠️ Troubleshooting

AI Guide Not Responding

✓ Check Ollama is running: http://localhost:11434
✓ Verify models are installed by running "ollama list" on CMD
✓ Check firewall not blocking port 11434
✓ Disable antivirus temporarily

Microphone Not Working

✓ Set VR headset mic as Windows default
✓ Test in Windows Sound Settings

Log Files Not Generating

✓ Run software as Administrator
✓ Check write permissions
✓ Disable antivirus temporarily
✓ Verify PlayerID.txt exists

Performance Issues

✓ Use lighter model (gemma2:2b)
✓ Close background apps
✓ Use wired VR connection

📬 Share Your Visitor Data

Help Improve This Museum!

If you've visited this museum and would like to contribute your anonymized visitor data to help improve the experience in a possible future iteration, you can share your logs in two ways:

Option 1: Email Your Logs

Send your exported files to: [email protected]

Include all 4 files from your visit:

  • gaze_log_[ID].csv
  • journey_log_[ID].txt
  • chat_log_[ID].txt
  • survey_[ID].json

Option 2: Pull Request on GitHub

For users comfortable with Git:

  1. Fork this repository
  2. Add your log files to the respective folders of: Assets/DimitrisKostadimas/Logs/
    • Place in appropriate subfolders (gaze/, journey/, chat/, survey/)
  3. Create a pull request titled: Visitor Logs for [vX.X] [Date]
    • Replace the placeholders with the software version you used and the date you created your pull request (e.g. Visitor Logs for v1.0 24/11/2025)

👤 Author

Dimitris Kostadimas (Kuhaku Neko)


📖 Citation

If you use this museum in research:

@software{kostadimas2025japanese_vr_ai_museum,
  author = {Kostadimas, Dimitris},
  title = {Japanese VR AI Museum},
  year = {2025},
  url = {https://github.com/KuhakuNeko/VR-AI-Museum},
  note = {VR museum featuring Japanese ukiyo-e with RAG-powered 
          AI guide and visitor behavior logging}
}

📄 License

© 2025 Dimitris Kostadimas. All rights reserved.

Licensed under Custom Non-Commercial License.

  • ✅ Free for personal, educational, and research use
  • ❌ Commercial use requires written permission
  • 📋 See LICENSE.md for full terms
  • Third-party components/assets use their own respective licenses

Acknowledgments

  • Japan 🇯🇵 - For its unparalleled artistic heritage that inspired this project
  • Hokusai, Hiroshige, and Heikichi - Masters whose works grace this virtual museum
  • Ollama Community - Making local AI accessible
  • Open Source Contributors - Whisper.cpp, Gemma, and countless tools
  • The open-source GitHub projects that were used in this project
  • All the free asset creators for making such cool assets accessible!

Built with deep respect for Japanese culture and art

⭐ Star this repo if you like it.
