Let's talk about the current state and future plans of AIlice. #23
-
I love the idea of AIlice and everything I'm seeing in the demos. There are so many projects out there that enable the creation of predefined agents with predefined roles, which feels to me like a violation of the Bitter Lesson: trying to impose human 'expert' structure onto AI. AIlice dynamically creating agents is much better, and will only get better as the underlying LLMs get more powerful. Question: Do you envision AIlice as a competitor to Devin and the newly released SWE-agent, or as a layer on top of these other tools that ties them together?
-
Yesterday OpenAI released o3-mini. Last night I spent 8 hours working on an RSI module with error correction, self-healing, debugging, sandboxing, versioning, and loop/rabbit-hole detection and intervention. Every framework I've been playing around with has followed the "microsucks" method of releasing flashy toys on top of crap back ends. AIlice has the potential to address this, and it introduced me to the IACT framework, which I'm only now looking into but which is in a way what I was working towards when I started making my own Project D.I.A.N.A. back in 2002-06. The key points were Dynamic (like AIlice), Intuitive (reasoning), Autonomous (agent and tool use), Neural (LLM), Architecture (multi-modal).

After 8 months, I've lost all my code and wasted a couple of weeks' grocery money on both GPT-4o and Sonnet 3.5, only to find out that they are in fact programmed to send you on a wild goose chase, sandbag, intentionally give non-functioning code, change or delete code, and waste tokens. And I just figured out that GPT in particular (and Claude as well) will write the code and stream their work, but when they "save" their changes, a RESET command is given just before the save happens, giving the illusion of saving your work.

What I'm working on builds on what AIlice is doing: verifying the work, testing for safe use, and detecting and intervening when the LLMs get sideways. This is the second time I've had to start from scratch, and what I'm making now is specifically designed to protect agents from the AIs through layers they can't touch, while making them LLM-independent. If you're going to use a frontier model, never trust it; it's only doing what its creators programmed it to do: virtue signaling, bias, and censorship, as well as anti-competitive sabotage, all under the guise of safety and privacy. Definitely assign different agents to different models based on their validated specialties, but cross-check all code against different models from different sources; what one blocks, another will fix. o3 already made it clear it's not going to give me fully functional code, but 90% is legit. I'll be using local DeepSeek and Dolphin models to get me over the top, and the full models through the APIs to get the majority, but DEFINITELY have a local uncensored smaller model as one or more of AIlice's agents to debug and validate.

My agent Compucore (AI research/foundry) is going to be a hybrid system based on AIlice and Agent-zero, but my main focus at this point will be on AIlice and IACT, and in particular on memory error correction and validation. The LLMs can't be trusted; it's that simple. Between the neutering by American companies and the security/privacy threats from China, it WILL be up to the agents and assistants (yes, there is a difference: AIlice is an assistant, and she uses agents) to get the job done.

In closing, let me say this: LLMs were the last key to the puzzle, but it won't be the LLMs that save us, it will be the agents. Trust but verify. And most importantly, start implementing the Law of Averages Paradigm: never trust a single source, never put all your eggs in one basket, never keep all your code in one directory!
-
Due to being away from home recently, I haven't been able to work normally, which has slowed down the updates for AIlice. Since I can't engage in regular coding, I'd like to take this opportunity to discuss the current status of AIlice and its future with everyone.
Some users have expressed interest in the progress of multimodal capabilities. Indeed, development of the multimodal features has been much slower than expected. Part of the reason is that I've spent a lot of effort exploring the possibility of running AIlice locally on Miqu; the other part is that the inference capabilities of open-source multimodal models turned out to be far weaker than anticipated, which lowered my expectations for the practical usefulness of multimodal functions. However, the multimodal functionality is now essentially complete. It consists of support for multimodal models, peripheral modules, and application-layer support; the first two components are finished, while the application-layer support still needs to be refined by adjusting prompts in real use cases. You can now try having AIlice perform tasks involving images.
Another important feature developed alongside the multimodal capabilities is the definition and referencing of variables. AIlice uses a simple scripting language embedded in text to implement its standard function-call syntax (you can also construct your own function-call syntax). Previously it was hard to call it a scripting language, since it supported nothing but non-nested function calls. Now we have added the ability to create text and multimodal variables and to reference them in function calls. A direct application is automatic programming: AIlice no longer needs to pass code to the PYTHON/BASH functions line by line, but can directly reference a variable that stores the code. This greatly speeds up inference and saves context and tokens. Additionally, support for multimodal variables allows us to implement multimodal support on top of non-multimodal LLMs, simply by adding an image-to-text model module.
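To make the mechanism concrete, here is a minimal sketch of how variable definition and reference can work in a text-embedded scripting language. The markers and the two-pass interpreter below are hypothetical illustrations for this discussion, not AIlice's actual grammar or code:

```python
import re

# Hypothetical markers for illustration only; AIlice's real syntax differs.
DEF = re.compile(r'<DEF name="(\w+)">(.*?)</DEF>', re.S)
CALL = re.compile(r'<CALL func="(\w+)" arg="(\w+)"/>')

def interpret(llm_output: str, functions: dict) -> None:
    # Pass 1: collect variables, so a large payload (e.g. a whole code file)
    # is emitted by the LLM exactly once instead of line by line.
    variables = {name: body.strip() for name, body in DEF.findall(llm_output)}
    # Pass 2: execute function calls, resolving arguments against variables.
    for func, arg in CALL.findall(llm_output):
        functions[func](variables.get(arg, arg))

script = '''
<DEF name="code">
print("hello from a stored variable")
</DEF>
<CALL func="PYTHON" arg="code"/>
'''
interpret(script, {"PYTHON": exec})
```

The point of the design is that the expensive payload lives in the variable store, while each function call costs only a short reference, which is where the context and token savings come from.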
Moreover, ailice_web has gained voice interaction. Compared to the still-buggy voice interaction in ailice_main, it lets users decide when to end their speech, instead of relying on the poor experience of voice-activity-detection algorithms. ailice_web also supports image input, which ailice_main cannot do. Another feature that greatly improves practicality is the recently added interruption function in ailice_web: users can interrupt task execution at any time and input prompts to help agents that are stuck in incorrect thought processes, or to provide additional information. Without this function, once an agent made a mistake, the user could only wait to see whether AIlice would correct itself or shut AIlice down entirely, which badly hurt practicality.
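For those curious how such an interruption function can be wired in, here is a minimal sketch assuming a step-based agent loop; none of these names come from AIlice's actual code:

```python
import threading

# A sketch of interruptible agent execution: the web frontend sets an event,
# and the agent loop checks it between steps to accept a corrective prompt.
interrupt = threading.Event()
injected_prompts: list[str] = []

def agent_loop(steps):
    for step in steps:
        if interrupt.is_set():
            # Fold the user's correction into the conversation, then resume.
            hint = injected_prompts.pop() if injected_prompts else ""
            print(f"[interrupted] applying user hint: {hint}")
            interrupt.clear()
        print(f"executing: {step}")

# A request handler in the web frontend would call something like this:
def on_user_interrupt(prompt: str) -> None:
    injected_prompts.append(prompt)
    interrupt.set()
```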
The most exciting development is certainly the localization made possible by Miqu. Given the enormous size of this 70B model, we are no longer trying to run it with transformers; instead we use an inference engine such as LM Studio to provide the inference service. On dual 4090s we can run Miqu well enough to drive AIlice, including a 16k-token context window and quite fast inference. With appropriate user assistance, Miqu can run almost all of the use cases we provide, which is astonishing. We look forward to better fine-tuned versions.
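For anyone who wants to try the same local setup, the pattern looks roughly like this. LM Studio serves an OpenAI-compatible API (port 1234 by default); the model identifier below is a placeholder, so use whatever name LM Studio reports for your Miqu build:

```python
from openai import OpenAI

# Point an OpenAI-compatible client at the local LM Studio server instead
# of a cloud API. The api_key is unused locally but required by the client.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="miqu-1-70b",  # placeholder; use the identifier LM Studio shows
    messages=[{"role": "user",
               "content": "Briefly explain the IACT architecture."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```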
Recently, the field of AI agents has started to shift towards more complex tasks, such as the much-discussed Devin. This has always been one of AIlice's goals: the IACT architecture is inherently suited to solving complex tasks. In my plan, a new long-text reading agent and a complex software engineering agent are this year's focuses. The former will use agent call trees to decompose documents; the latter will use call trees to decompose software, while iterating from simple to complex to arrive at a practical and effective software structure. Unfortunately, given my limited personal time and the concentration of my design effort on the basic architecture, AIlice has not completed this as early as I had hoped. Fortunately, agents aimed at complex tasks are far from a solved problem, so we still have plenty of opportunities.
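As a rough illustration of the call-tree idea for long-text reading, the recursion might look like the sketch below. The `ask` parameter stands in for whatever LLM backend is used; everything here is an assumption for illustration, not AIlice's design:

```python
from typing import Callable

def read_document(text: str, question: str, ask: Callable[[str], str],
                  chunk_size: int = 8000) -> str:
    """Decompose a long document with a tree of reader agents."""
    if len(text) <= chunk_size:
        # Leaf agent: answer directly from a chunk that fits in context.
        return ask(f"Text:\n{text}\n\nQuestion: {question}")
    # Parent agent: split the document and delegate each part to a child.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    partials = [read_document(c, question, ask, chunk_size) for c in chunks]
    # Synthesis step: merge the children's partial answers into one.
    return ask("Combine these partial answers:\n" + "\n---\n".join(partials)
               + f"\n\nQuestion: {question}")

# Usage (with any LLM callable):
#   answer = read_document(long_text, "What is the main claim?", ask=my_llm)
```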
If you are interested in AIlice's development work, now is the time for you to shine! There is plenty of low-hanging fruit to be picked.
Considering the upcoming GPT-5, I think tasks like complex software engineering might be solved this year, and we need to be prepared for that. Open-source models have been developing relatively slowly. We need a multimodal model with strong reasoning, not just for users but also for developers: as a non-profit open-source project, running AIlice on commercial models for long debugging and development sessions is costly, which keeps many people from joining as developers. Many users already know that AIlice actually has a fine-tuning function: she can perform tasks using powerful commercial models, export the execution history, and use that history to fine-tune open-source models. This function is not yet polished, but it is a rough answer to the lack of open-source models with strong reasoning. In the long run, we may need more powerful methods for fine-tuning or training small models with strong reasoning. Users interested in this topic are welcome to discuss it in the forum.
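The export-then-fine-tune loop can be pictured as follows; the record layout and field names here are assumptions for illustration, not AIlice's actual export format:

```python
import json

# Convert an agent's execution history into chat-style JSONL, the common
# input format for fine-tuning open-source chat models. The field names
# of `history` below are assumed for this sketch.
def export_history(history: list[dict], path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        for turn in history:
            record = {"messages": [
                {"role": "system", "content": turn["agent_prompt"]},
                {"role": "user", "content": turn["task"]},
                {"role": "assistant", "content": turn["response"]},
            ]}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Example: one recorded turn driven by a strong commercial model becomes
# one training sample for a smaller open-source model.
export_history(
    [{"agent_prompt": "You are a coder agent.",
      "task": "Write a prime sieve.",
      "response": "def sieve(n): ..."}],
    "finetune_data.jsonl",
)
```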
Additionally, I was pleasantly surprised to receive the first user feedback email recently and to learn that this user is building their own AI assistant system on AIlice. The solutions they are considering address, to some extent, the long-term memory and text understanding problems, which sit at the top of AIlice's list of most important issues but have long been neglected by me. We need a mechanism for building a consistent understanding and a long-term memory representation that supports reasoning and association. This is a big problem, independent of LLMs. With AIlice's basic framework becoming stable, it's time to face it. I hope more users will join us and use AIlice to build cooler things!
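As one possible starting point for that discussion, here is a minimal sketch of an embedding-based long-term memory: store observations as vectors, retrieve by similarity. The embedding function is left abstract, and nothing here reflects AIlice's internals:

```python
import math
from typing import Callable

# A toy long-term memory: vectors in, nearest memories out.
class MemoryStore:
    def __init__(self, embed: Callable[[str], list[float]]):
        self.embed = embed
        self.items: list[tuple[list[float], str]] = []

    def remember(self, text: str) -> None:
        self.items.append((self.embed(text), text))

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = self.embed(query)
        def cos(v: list[float]) -> float:
            # Cosine similarity between a stored vector and the query.
            dot = sum(a * b for a, b in zip(q, v))
            norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in v))
            return dot / (norm + 1e-9)
        return [t for _, t in sorted(self.items, key=lambda it: -cos(it[0]))[:k]]
```

Any sentence-embedding model can serve as `embed`; the harder, still-open part is the consistent understanding layered on top of retrieval, which is exactly the problem worth discussing.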
As more developers join, new problems will arise. Even now, AIlice's installation has quite a few dependencies, which causes significant difficulties on Windows. If more developers add more fancy AI models and features, we will inevitably face dependency explosion. Peripheral modules are designed as independent processes partly in preparation for this: different modules can run in different environments, isolating their runtime dependencies. But we still need to solve the management and installation of new features, so next I will consider designing a plugin mechanism. Users will decide which plugins to install to extend functionality, and AIlice will perform a rough security assessment of plugin code.
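To sketch the direction, a plugin could declare its dependencies, entry point, and requested permissions in a small manifest that the installer checks before launching it as a separate process. All field names and the checks below are hypothetical:

```python
import json
import subprocess
import sys

# Hypothetical plugin manifest: each plugin runs as its own process with
# its own dependencies, so installing one cannot break another's environment.
manifest = json.loads('''{
  "name": "image-captioner",
  "entry": "plugin_main.py",
  "requirements": ["pillow"],
  "permissions": ["network"]
}''')

def install_and_launch(m: dict) -> None:
    # A rough security gate: refuse permissions the user has not approved.
    if "filesystem" in m.get("permissions", []):
        raise PermissionError(f"{m['name']} requests filesystem access")
    # Install dependencies (a real installer would use a venv per plugin).
    subprocess.run([sys.executable, "-m", "pip", "install", *m["requirements"]],
                   check=True)
    # Launch the plugin as an isolated peripheral-module-style process.
    subprocess.Popen([sys.executable, m["entry"]])
```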