Axon is an autonomous agent for Minecraft, combining a Large Language Model with the tools and context to control Baritone within a feedback loop. Give it commands in plain English, and watch it automate everything from mining resources to executing complex, multi-step plans.
Axon integrates:
- Gemini for high-level planning and natural language control
- Standard agent (tools-in-a-loop) setup with access to a chat with the user, a variety of contextual information, and a number of functions to call
- Baritone for intelligent pathfinding (navigation, waypoints, mining, farming, and more)
- Manages Baritone's state and gives the LLM access to current status and functions
- Meteor Client for player management (managing tools, food, armor, items, etc., defending against mobs, and more)
- Automatically configures a variety of modules to help the bot take care of itself
Requirements:
- Minecraft 1.21.4
- Fabric Loader 0.16.13
Put these mods in your .minecraft/mods/ folder:
Get your free Gemini API key (no payment method needed, just a Google account):
- Go to Google AI Studio
- Click "Create API key" and follow the steps
- Copy the generated API key
- In Minecraft, run
/a-key <your_api_key>
You're all set! Axon is now ready for your commands.
Use the /a command followed by your request in natural language.
/a <your request here>
If the agent ever gets stuck or confused, you can reset its memory and stop all actions with:
/a-clear
Here are a few things you can ask Axon to do:
-
Navigation
/a go to 120 -450/a go to the surface/a remember that this is home
-
Mining & Gathering
/a mine 50 iron ore/a get 64 logs, then go back to where you started/a go to Y level -54 and mine diamonds
-
Interaction & Inventory
/a follow the player named Steve/a drop the diamonds you collected/a put all your weapons in the hotbar
-
Information
/a what's in my inventory?/a what are my coordinates?/a tell me about what's going on
Here's the basic flow:
- You type a command like
/a get a stack of wood - The agent packages your command, the past conversation history, and real-time game data (your exact position, health, hunger, current biome, time of day, a complete list of your inventory)
- This information is sent to the Gemini LLM, which processes it and returns a response containing text and/or function calls.
- The agent receives the function call from the AI and executes it (like "call the
baritone_minefunction foroak_log") - The result of the action(s) are sent back to the LLM in a loop until it stops returning function calls, allowing Gemini to execute a multi-step plan sequentially or adapt to changes in the environment
