Multi-LLM-Agent for taking photos
Sign up on the OpenAI developer platform and get an API key. Create a .env file containing:
OPENAI_API_KEY=xxxxxxxxxxxxxxxxxxxyour_api_keyxxxxxxxx
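To check that the key is picked up before starting anything else, the .env file can be read with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `load_env_file` is hypothetical and not part of this repo, and the project itself may load the file differently (for example via python-dotenv).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file (hypothetical helper).

    Returns the parsed values as a dict and copies them into os.environ
    without overwriting variables that are already set.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        values[key] = value
        os.environ.setdefault(key, value)
    return values


# Usage (run from the folder that contains .env):
#   load_env_file()
#   assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"
```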
# run relay servers
python relay_server.py --port=6000
python relay_server.py --port=6001
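Before moving on, it can help to confirm that both relay servers are actually up. The sketch below assumes the relay servers accept TCP connections on localhost, which this README does not state; `port_is_open` and `missing_relay_ports` are hypothetical helper names, not functions from this repo.

```python
import socket


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def missing_relay_ports(ports=(6000, 6001)):
    """Return the ports from `ports` that nothing is listening on."""
    return [p for p in ports if not port_is_open(p)]


# Usage:
#   missing = missing_relay_ports()
#   if missing:
#       print("relay server not reachable on ports:", missing)
```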
# run realsense camera pipeline
python rs_server.py #make sure to run this in conda env rs
# run Sony camera
Open the executable Release_zoom_parameters (under 一些备份/相机exe).
# run the program
Now you can run multi_agent_fixed_sequence.ipynb.
## Run the programs in order
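# In teleop/video_capture_card
python enumerate_devices.py # get the device ID of the USB Sony camera
python opencv_stream.py # fill in the ID, then run the program to stream live images
# In teleop/controller_publisher_json
python relay_server.py # occupies the communication port and acts as the message relay server
python controller_publsiher.py # publish controller messages
python robotics_arm_listener_copy.py # the robot arm listens for messages and executes the actions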
# Sony camera C++ program
#\3.21 archive\sony camera c++ file\相机exe\Release latest
Run the executable.
A detailed version is in branch teleop_archive: photo_agents/file_before_archive/READEME.md
In this code, we decompose the picture-taking procedure into 4 steps:
- Move the camera (before and after the movement, the camera always points at the object).
- Zoom in or out, which determines how much of the picture the object occupies.
- Adjust the ISO, shutter speed (shutter speed is not supported), and aperture.
- Compose the shot by placing the object according to the rule of thirds (nine-grid composition), e.g. at x = 0.33, y = 0.66 of the image.
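The zoom and composition steps above come down to simple geometry on the object's bounding box. The following is a minimal sketch, not the code in high_level_tools.py: the function names are hypothetical, the bounding box is assumed to be `(x_min, y_min, x_max, y_max)` in pixels, the target position is assumed to be a fraction of the image width and height, and the "object ratio" is assumed to mean bounding-box area divided by image area.

```python
def object_ratio(bbox, width, height):
    """Fraction of the image area covered by the bounding box (assumed definition)."""
    x_min, y_min, x_max, y_max = bbox
    box_area = max(0, x_max - x_min) * max(0, y_max - y_min)
    return box_area / (width * height)


def composition_offset(bbox, width, height, x_frac=0.33, y_frac=0.66):
    """Pixel shift (dx, dy) that moves the object's center to the target point.

    The target is (x_frac * width, y_frac * height); the defaults follow the
    rule-of-thirds example of x = 0.33, y = 0.66.
    """
    x_min, y_min, x_max, y_max = bbox
    center_x = (x_min + x_max) / 2
    center_y = (y_min + y_max) / 2
    target_x = x_frac * width
    target_y = y_frac * height
    return target_x - center_x, target_y - center_y


# Usage: an object centered in a 1920x1080 frame must move left and down
# (image y grows downward) to reach the lower-left thirds point.
dx, dy = composition_offset((900, 500, 1020, 580), 1920, 1080)
```

A negative `dx` means the object should appear further left in the frame; how that maps to a camera or arm motion depends on the setup and is outside this sketch.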
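Some test code
- The tools and functions that the agent can use are all placed in this folder.
- config holds all the data: the usable objects and the JSON data files.
- high_level_tools.py holds all the high-level functions: move camera, zoom, adjust parameters, composition.
For details, see detailed documentation.md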