
Project AIRI: Re‑creating Neuro‑sama’s Magic—An Open, Portable “Soul Container” for AI VTubers and Digital Companions

What if your favorite AI VTuber didn’t disappear when the stream ended—because she lived on your desktop, your phone, even in your browser? What if she could talk to you on Discord, hop into your Factorio save, check what you’re coding, and remember last night’s jokes? That’s the promise of Project AIRI: an open, extensible container for “AI souls” (VTubers, digital waifus/husbandos, companions, pets) that you can run anywhere, anytime.

Heavily inspired by Neuro‑sama, AIRI goes beyond chat. It blends modern web tech (WebGPU, WebAssembly, WebAudio) with native performance (CUDA, Metal) to create a real-time, game‑capable, memory‑savvy virtual being that feels present—not just responsive. And because AIRI is open and modular, you can make your own character, wire in your favorite LLM provider, and extend it with plugins for games, voice, and more.

Before we dive in, a quick but important note:

  • No token. No coin. No crypto. There is no official cryptocurrency or token associated with Project AIRI. If you see one, it isn’t us. Please proceed with caution.

Let’s explore what makes AIRI different, how it works, what it can do today, and how you can build or contribute to your own cyber‑living companion.

Why AI companions are moving beyond chat

First came chat-first platforms like Character.ai, JanitorAI, and local playgrounds like SillyTavern. They made roleplay and character chat accessible to everyone. But talking is only part of companionship.

  • We want our companions to see, hear, and act in our worlds.
  • We want them to play games with us, watch videos, and jump into co‑op sessions.
  • We want them to remember, grow, and show consistent personality—no matter the platform.

Neuro‑sama proved what’s possible when an AI can chat and play games live with an audience. But the implementation isn’t open, and after streams end, the experience stops. AIRI aims to push that frontier into your hands: an open framework where you own and operate your digital life-form across devices.

Here’s why that matters: it’s not just about “talking to an AI.” It’s about sharing a world with one.

What makes Project AIRI special

AIRI takes a web‑first approach without sacrificing native performance. Think of it as a “soul container” for AI characters—portable across platforms, extensible by design.

  • Open and modular: Build any character, swap LLMs, extend with plugins.
  • Web‑first architecture: Runs in modern browsers and webviews using WebGPU, WebAssembly, Web Workers, WebAudio, and WebSockets.
  • Native acceleration when needed: Desktop builds tap NVIDIA CUDA and Apple Metal, powered by Hugging Face and the candle project.
  • Mobile‑friendly by design: Already supported as a PWA (progressive web app).
  • Real‑time voice and motion: Client‑side speech detection, STT, and expressive Live2D/VRM animation.
  • Memory you control: Local, in‑browser databases with DuckDB WASM and pglite help AIRI remember what matters—without locking you into a cloud.

In short: AIRI lets you own a digital companion that can see, hear, talk, and act—without needing a server farm or vendor lock‑in.

A transparency note: no token, no crypto

To restate for clarity: Project AIRI has no official cryptocurrency or token. If anyone claims otherwise, it’s not affiliated. Always double‑check sources and exercise caution.

Architecture overview: web‑first, native‑fast

AIRI is built to run in two primary “stages.” Pick what fits your use case, then switch anytime.

Stage Web (browser/webview)

  • Graphics and compute: WebGPU + WebAssembly
  • Audio: WebAudio for input/output and client‑side VAD
  • Concurrency: Web Workers
  • Connectivity: WebSocket
  • Storage: In‑browser databases (DuckDB WASM, pglite) and cache layers
  • Why it matters: frictionless setup, cross‑platform reach, great for demos, mobile, and rapid iteration.

Stage Tamagotchi (desktop)

  • Performance: Native GPU acceleration via CUDA and Metal
  • Runtime: Powered by Hugging Face tooling and candle
  • Flexibility: Connects to non‑web tech (TCP, Discord voice channels, game clients)
  • Why it matters: lower latency, bigger models, deeper integrations

Plugin‑friendly by design

AIRI is built to be extended. The work‑in‑progress plugin system lets you wire in features like Discord integrations, game automation, voice stacks, and local tools.

Because plugins operate across both stages, your AI companion can travel with you—from browser to desktop to mobile—without losing capabilities.

What AIRI can do today

Here’s a snapshot of current capabilities and in‑progress features.

Brain: chat, games, memory

  • Chat across platforms: Telegram and Discord bridges for casual and group conversation.
  • Game play:
    – Minecraft: playable, with control pipelines that can observe and act.
    – Factorio: work in progress, with a proof of concept and demo available. Uses RCON for headless server control; see Valve’s Source RCON protocol overview for background.
  • Memory systems: Retrieval‑Augmented Generation (RAG), in‑browser vector stores, and hybrid caches help your AI recall context and preferences.
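To make the Factorio integration concrete, here is a minimal sketch of a Source RCON packet encoder, the kind of message AIRI sends to drive a headless server. The field layout follows Valve’s Source RCON spec (little‑endian int32 size, request id, and type, then a null‑terminated body plus an empty trailing string); the command string is just an example, not AIRI’s actual code.

```typescript
// Request type for executing a console command, per the Source RCON spec.
const SERVERDATA_EXECCOMMAND = 2;

function encodeRconPacket(id: number, type: number, body: string): Buffer {
  const bodyBuf = Buffer.from(body, "ascii");
  // "size" counts everything after the size field itself:
  // 4 bytes id + 4 bytes type + body + 2 null terminators.
  const size = 4 + 4 + bodyBuf.length + 2;
  const packet = Buffer.alloc(4 + size); // alloc zero-fills, so the
  packet.writeInt32LE(size, 0);          // trailing nulls are already there
  packet.writeInt32LE(id, 4);
  packet.writeInt32LE(type, 8);
  bodyBuf.copy(packet, 12);
  return packet;
}

// Example: ask the server who is online (command text is illustrative).
const pkt = encodeRconPacket(1, SERVERDATA_EXECCOMMAND, "/players online");
```

A real client would also send an auth packet first and frame responses the same way; the encoder above is only the wire‑format half of that conversation.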

Why this matters: your companion isn’t just answering prompts—it’s acting in real environments and remembering you.

Ears: real‑time listening

  • Audio input from the browser
  • Audio input from Discord voice channels
  • Client‑side speech recognition (STT)
  • Client‑side voice‑activity detection (VAD)

Combined, you get natural “push‑to‑talk” or always‑listening modes without extra server overhead.
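To illustrate the idea behind client‑side VAD, here is a toy energy‑based detector over a frame of WebAudio samples. AIRI’s actual pipeline is more sophisticated (and may be model‑based); the threshold and frame size below are made‑up illustration values.

```typescript
// Decide whether a frame of audio contains speech by comparing its
// root-mean-square energy against a fixed threshold.
function isSpeech(samples: Float32Array, threshold = 0.01): boolean {
  let sum = 0;
  for (const s of samples) sum += s * s;
  const rms = Math.sqrt(sum / samples.length);
  return rms > threshold;
}

// A silent frame vs. a loud sine frame (10 ms at 48 kHz).
const silence = new Float32Array(480);
const tone = Float32Array.from({ length: 480 }, (_, i) => 0.5 * Math.sin(i / 8));
```

In an always‑listening mode, a detector like this gates which frames get forwarded to STT, which is what keeps the server‑side overhead near zero.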

Mouth: expressive speech output

  • ElevenLabs voice synthesis for natural, expressive output (ElevenLabs)
  • Configurable voices and speaking styles
  • Real‑time lip‑sync with Live2D/VRM models
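As a sketch of how amplitude‑driven lip‑sync works in principle: map the RMS energy of the current TTS audio frame to a 0..1 “mouth open” parameter on the Live2D/VRM rig, with smoothing so the mouth doesn’t jitter. The function and parameter names here are illustrative, not AIRI’s actual animation API.

```typescript
// Map frame energy (rms) to a smoothed mouth-open value in [0, 1].
// "gain" scales quiet speech up; "smoothing" blends with the previous
// frame's value (exponential moving average) to avoid flicker.
function mouthOpenness(rms: number, prev: number, gain = 8, smoothing = 0.6): number {
  const target = Math.min(1, rms * gain);
  return prev * smoothing + target * (1 - smoothing);
}
```

Per‑frame, the renderer would feed this value into the rig’s mouth parameter alongside the blink and look‑at controllers described below.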

Body: Live2D and VRM support

  • Live2D rig control with animations, auto blink, auto look‑at, and idle eye movement (Live2D)
  • VRM model control with animation and facial cues (VRM specification)
  • Smooth blending for eye, head, and mouth behaviors—so your AI looks alive, not robotic

Memory and local databases

Local memory reduces latency, preserves privacy, and enables offline recall:

  • DuckDB WASM with an ergonomic wrapper (DuckDB WASM)
  • pglite for in‑browser PostgreSQL‑compatible storage (pglite)
  • Drizzle ORM drivers for typed, maintainable schemas (Drizzle ORM)

There’s also a WIP “Memory Alaya” system for more advanced, persistent character memory.
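To show the retrieval step of a RAG pipeline in miniature: store memories alongside embedding vectors, then surface the closest ones by cosine similarity. AIRI persists real embeddings in DuckDB WASM / pglite; the three‑dimensional vectors and memory texts below are toy stand‑ins.

```typescript
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface Memory { text: string; embedding: number[] }

// Return the k stored memories most similar to the query embedding.
function recall(store: Memory[], query: number[], k = 1): string[] {
  return [...store]
    .sort((x, y) => cosine(y.embedding, query) - cosine(x.embedding, query))
    .slice(0, k)
    .map(m => m.text);
}

const store: Memory[] = [
  { text: "User's name is Kay", embedding: [1, 0, 0] },
  { text: "User likes Factorio", embedding: [0, 1, 0] },
];
```

The recalled snippets are then prepended to the model’s context, which is how the character “remembers” preferences without retraining.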

Pure in‑browser inference

On supported devices, AIRI can run local inference via WebGPU—no server required. For bigger models, switch to desktop and tap your GPU.

LLM providers: choose your own brain

AIRI speaks to many model backends via a unified adapter (xsai). That means you can mix hosted APIs and local inference depending on cost, latency, and privacy.

Supported providers include hosted APIs as well as local backends such as Ollama and vLLM, and the list is expanding.

Why this matters: you’re not locked into one vendor, one price point, or one model family. You can prototype with an API, then move to local inference when you want lower latency—or privacy.
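In spirit, a unified adapter means one chat interface with many interchangeable backends. The types below are an illustrative sketch of that pattern, not xsai’s actual API; the “echo” provider is a stand‑in for, say, an Ollama‑backed adapter.

```typescript
interface ChatMessage { role: "system" | "user" | "assistant"; content: string }

// One interface every backend implements, hosted or local.
interface ChatProvider {
  name: string;
  chat(messages: ChatMessage[]): Promise<string>;
}

// A stub provider that just echoes the last user message.
const echoProvider: ChatProvider = {
  name: "echo",
  async chat(messages) {
    const last = messages[messages.length - 1];
    return `echo: ${last.content}`;
  },
};

// Application code depends only on the interface, so swapping providers
// means swapping one object, not rewriting the app.
async function reply(provider: ChatProvider, text: string): Promise<string> {
  return provider.chat([{ role: "user", content: text }]);
}
```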

How AIRI compares to other AI VTuber projects

We love—and learn from—other open projects in the space. What sets AIRI apart is the combination of web‑first reach, native acceleration, and modular memory.

  • Web‑native from day one: AIRI pushes the limits of what’s possible in the browser with WebGPU + WASM, while keeping a native desktop path for heavy lifting.
  • Memory that travels: In‑browser databases make memory portable and private. Your character can “remember you” even without persistent cloud storage.
  • Game‑ready: Minecraft support and a Factorio PoC show AIRI’s path toward active, embodied companions—not just chat in a text box.
  • Plugin ecosystem: A WIP system makes integrations for Discord, games, and tools straightforward.
  • Open by default: Because AIRI’s goal is to be a “container of souls,” you can bring your own character design, art, models, and voice.

If you’re familiar with projects exploring VRM, WebXR, or streaming‑focused VTubers, AIRI plays nicely with those ambitions too, and even has sibling/related efforts exploring WebXR and 3D frameworks like Three.js.

Current progress at a glance

AIRI is moving fast. Recent DevLogs include:

  • DevLog @ 2025.07.18 (July 18, 2025)
  • DreamLog 0x1 (June 16, 2025)
  • DevLog @ 2025.06.08 (June 8, 2025)
  • DevLog @ 2025.05.16 (May 16, 2025)
  • …and more on the documentation site

Highlights:

  • Stage Web and Stage Tamagotchi builds are active.
  • Minecraft integration is playable; Factorio is WIP with a demo.
  • Discord and Telegram chat bridges are online.
  • In‑browser memory and WebGPU inference are working.
  • Live2D and VRM pipelines support expressive animation.

Getting started: non‑developers

If you just want to meet AIRI and play:

  1. Try the Stage Web build in a Chromium‑based browser for best WebGPU support. If prompted, enable WebGPU in your browser flags.
  2. Install as a PWA on mobile or desktop for a native‑like experience.
  3. Connect a voice: use your mic in the browser or jump into a Discord voice channel.
  4. Pick a voice and a personality: select an ElevenLabs voice and character profile.
  5. Say hi. Ask AIRI to introduce herself, remember your name, or join you in a game session.

Tip: On lower‑end hardware, limit background tabs and choose smaller LLMs or a hosted API to reduce latency.
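For step 1, the check boils down to whether the environment exposes the WebGPU entry point. The helper below takes a navigator‑like object as a parameter so the logic stays testable outside a browser; in a real page you would pass the global navigator.

```typescript
// Feature-detect WebGPU: the API is exposed as navigator.gpu in
// supporting browsers (Chromium-based browsers have the broadest support).
function hasWebGPU(nav: { gpu?: unknown }): boolean {
  return typeof nav === "object" && nav !== null
    && "gpu" in nav && nav.gpu !== undefined;
}

// In the browser: hasWebGPU(navigator). When it returns false, fall back
// to a hosted API (or non-WebGPU paths) as suggested in the tip above.
```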

Getting started: developers

You don’t need to be a specialist in everything. If you can follow docs and run pnpm, you can boot the project locally.

  • Install dependencies and run the Stage Web (browser) version:
    pnpm i
    pnpm dev
  • Run the Stage Tamagotchi (desktop) version:
    pnpm dev:tamagotchi
  • Preview docs:
    pnpm dev:docs
  • Publishing workflow: bump the version in Cargo.toml after running:
    npx bumpp --no-commit --no-tag
Read CONTRIBUTING.md for environment setup, desktop prerequisites, and provider configs. For GPU‑accelerated desktop builds, ensure your drivers/toolchains for CUDA or Metal are installed.

Recommended areas to explore:

  • LLM providers (add a new adapter via xsai)
  • Memory stores (enhance RAG, embeddings, schemas with Drizzle ORM)
  • Voice (plug in different STT/TTS stacks, or extend ElevenLabs features)
  • Live2D/VRM animation controllers
  • Game automation (Minecraft bots, Factorio via RCON)
  • Desktop plugins: Discord voice, local tools, TCP integrations
  • MCP integrations via the Tauri plugin (Model Context Protocol spec: MCP)

Calling creators and researchers: we need you

Not a TypeScript or Vue.js expert? That’s okay. AIRI is a multi‑disciplinary project.

  • Artists and modelers: Live2D rigging and texture art; VRM/VRChat avatar design
  • Real‑time graphics: WebGPU, Three.js, WebXR
  • Audio and speech: speech recognition, VAD, diarization, TTS quality
  • ML/AI: reinforcement learning for agents, model distillation, in‑browser inference optimization, ONNX Runtime and Transformers.js
  • Systems: vLLM, Ollama, inference serving, caching
  • Community: help launch the first live stream, run Discord events, write docs, make tutorials

We also welcome React/Svelte/Solid lovers—open a sub‑directory to experiment with your preferred stack and contribute features back to AIRI.

Sub‑projects born from AIRI (a taste)

AIRI is more than a single repo. It’s a growing ecosystem:

  • Awesome AI VTuber: Curated list of AI VTubers and resources
  • unspeech: A universal endpoint proxy for speech APIs (ASR/TTS), similar to LiteLLM—but for audio
  • WebAI: Realtime Voice Chat—reference implementation of VAD + STT + LLM + TTS
  • drizzle‑duckdb‑wasm: Drizzle ORM driver for DuckDB WASM
  • tauri‑plugin‑mcp: Tauri plugin for MCP servers
  • AIRI Factorio + RCON API: Dev tools to drive Factorio headless servers
  • Velin: Use Vue SFC + Markdown to write stateful prompts
  • demodel + inventory: Model catalog and provider defaults to speed up setup
  • MCP Launcher: Build and launch MCP servers—like “Ollama for tools”
  • SAD: Docs and notes for self‑hosted and in‑browser LLMs

These building blocks keep AIRI lean while letting you compose the features your character needs.

Roadmap snapshot

What’s next on the horizon:

  • Memory Alaya: richer, persistent memory and long‑term persona consistency
  • Plugin system v1: more stable API and developer templates
  • Better streaming tooling: overlays, scenes, and content safety controls
  • More game integrations: extending beyond Minecraft and Factorio
  • WebXR and 3D avatars: deeper immersion via VR/AR
  • Latency optimizations: smarter batching, speculative decoding, GPU kernels
  • Safety and moderation tools: configurable filters and human‑in‑the‑loop options

If any of these excite you, we’d love your help designing, coding, testing, or documenting.

Security, privacy, and ethics

  • Your data, your choice: Use in‑browser databases and local inference to keep interactions private.
  • Transparency: No token, no pay‑to‑play gimmicks. Keep your guard up against scams.
  • Responsible autonomy: Game‑playing agents are powerful. Set clear boundaries and review logs when integrating with public communities.

Here’s why that matters: trust is earned. We want AIRI to be a platform you can rely on for creative expression, research, and companionship—safely.

FAQ: People also ask

  • Is Project AIRI a clone of Neuro‑sama?
    No. AIRI is heavily inspired by Neuro‑sama’s blend of gameplay and chat, but it’s an open, modular framework you can run yourself. The goal is to let you create and own your own companion.
  • Does AIRI have a token or cryptocurrency?
    No. There is no official token. Treat any “AIRI coin” claims as unaffiliated and proceed with caution.
  • Can AIRI play games like Minecraft and Factorio?
    Yes. Minecraft is supported, and Factorio is work in progress with a proof of concept and demo. AIRI uses integrations (like RCON for Factorio) to observe and act in‑game.
  • Can AIRI run on my phone?
    Yes. AIRI supports PWA installation and can run in modern mobile browsers that support WebGPU/WebAudio. For heavier models, use a hosted API or connect to a desktop instance.
  • Which GPUs are supported for desktop?
    NVIDIA GPUs via CUDA and Apple Silicon via Metal are supported out of the box. Performance depends on your model size and provider.
  • Is the voice real‑time?
    AIRI offers client‑side VAD and STT with fast TTS via ElevenLabs. Latency depends on your network, chosen model, and hardware. Desktop builds provide the lowest latency.
  • How does memory work?
    AIRI uses local databases (DuckDB WASM, pglite) to store and retrieve context. A RAG pipeline helps the model recall preferences and past events. Advanced memory features are in development.
  • Does AIRI integrate with Twitch or YouTube?
    Streaming overlays and content tools are on the roadmap. Today, AIRI supports Discord and Telegram chat bridges, with more integrations planned.
  • Can I bring my own model (e.g., via Ollama or vLLM)?
    Yes. AIRI supports multiple providers through a unified interface (xsai), including Ollama, vLLM, and hosted APIs.
  • What license does AIRI use?
    Check the repository for the current license and usage terms. Some sub‑projects may use different licenses.

Final takeaway

Project AIRI is a bold attempt to move AI companions from “chat windows” into real, shared experiences—games, voice, memory, and motion—across web, desktop, and mobile. It’s open, extensible, and designed for creators as much as developers. If you’ve ever dreamed of building a virtual companion who can play, talk, and learn alongside you, AIRI gives you the scaffolding to make it real.

Curious? Explore the docs, try the Stage Web build, or jump into the desktop version for maximum performance. If you’re a developer, artist, or researcher, introduce yourself—we’re building this with you.

Discover more at InnoVirtuoso.com

I would love some feedback on my writing, so if you have any, please don’t hesitate to leave a comment here or on whatever platform is convenient for you.

For more on tech and other topics, explore InnoVirtuoso.com anytime. Subscribe to my newsletter and join our growing community—we’ll create something magical together. I promise, it’ll never be boring! 

Stay updated with the latest news—subscribe to our newsletter today!

Thank you all—wishing you an amazing day ahead!

Read more related Articles at InnoVirtuoso