ARC (Adaptive Reasoning Core) is a foundational large language model (LLM) project designed to optimize the cost and scalability of AI model training while maintaining cutting-edge performance. This project aims to train and deploy an open-source LLM inspired by models like GPT-2 and GPT-3, with a focus on innovation and accessibility.
- Cost-Optimized Training: Innovative training techniques reduce compute costs by up to 70%.
- Hugging Face Integration: Seamless use of Hugging Face’s libraries for data processing and model training (see the data-preparation sketch after this list).
- Open Source: Fully transparent and community-driven, allowing contributions and collaborations.
- Scalable Design: An architecture meant to grow iteratively from smaller-scale models (e.g., GPT-2 size) to larger-scale ones (e.g., GPT-3 size and beyond).
- Flexible Configuration: Easily adaptable parameters for experimenting with different model sizes, datasets, and training strategies.
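As a rough illustration of the Hugging Face workflow mentioned above, the snippet below loads a public corpus and tokenizes it with the GPT-2 tokenizer. The dataset, tokenizer, and sequence length here are placeholder assumptions for the sketch, not the project's fixed choices.

```python
# Minimal sketch of Hugging Face-based data preparation (illustrative only).
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token; set one in case padding is needed later

# "wikitext" is used purely as an example corpus, not the project's training data.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    # Truncate to an assumed 512-token context for this sketch.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
print(tokenized)
```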
The project's initial goals are to:
- Train a foundational LLM with 200M parameters as a prototype (a rough parameter-count sketch follows this list).
- Scale the model iteratively to explore the feasibility of larger architectures.
- Share insights and methods to encourage research and innovation in accessible AI development.
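To make the 200M-parameter target concrete, here is a back-of-the-envelope estimate of the parameter count of a GPT-2-style decoder stack. The formula ignores biases and layer norms, and the vocabulary size, context length, and scaled-up settings shown are assumptions for illustration only, not the project's exact accounting.

```python
# Rough parameter-count estimate for a GPT-2-style decoder stack (illustrative).
def approx_params(n_layers, d_model, d_ff, vocab_size=50257, n_ctx=1024):
    embeddings = vocab_size * d_model + n_ctx * d_model  # token + positional embeddings
    attention = 4 * d_model * d_model                    # Q, K, V, and output projections
    feedforward = 2 * d_model * d_ff                     # two linear layers in the MLP
    return embeddings + n_layers * (attention + feedforward)

# Default configuration from this README (6 layers, d_model=512, d_ff=2048): ~45M parameters.
print(f"{approx_params(6, 512, 2048):,}")

# One hypothetical route to the ~200M target: 12 layers, d_model=1024, d_ff=4096 gives ~204M.
print(f"{approx_params(12, 1024, 4096):,}")
```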
The Transformer model is inspired by GPT-2, with the following customizable parameters (a minimal configuration sketch follows the list):
- Number of Layers: Default is 6.
- Model Dimension (`d_model`): Default is 512.
- Attention Heads: Default is 8.
- Feedforward Dimension (`d_ff`): Default is 2048.
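The defaults above can be captured in a small configuration object. The class and field names below, as well as the vocabulary size and context length, are illustrative assumptions rather than the project's actual API.

```python
# Illustrative configuration object mirroring the defaults listed above.
from dataclasses import dataclass

@dataclass
class ARCConfig:
    n_layers: int = 6        # Number of Transformer decoder layers
    d_model: int = 512       # Model (embedding) dimension
    n_heads: int = 8         # Attention heads; d_model must divide evenly by n_heads
    d_ff: int = 2048         # Feedforward (MLP) hidden dimension
    vocab_size: int = 50257  # Assumed GPT-2 BPE vocabulary size
    max_seq_len: int = 1024  # Assumed context length

config = ARCConfig()
assert config.d_model % config.n_heads == 0
print(config)
```

Experimenting with different model sizes would then amount to changing fields such as `n_layers` or `d_model` rather than editing model code directly.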
We welcome contributions from the community! To get started:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Submit a pull request with a detailed description of your changes.
Looking ahead, the project plans to:
- Explore larger-scale models with billions or trillions of parameters.
- Integrate advanced fine-tuning methods for domain-specific tasks.
- Expand the ARC ecosystem with additional tools and features for researchers.
This project is licensed under the MIT License. See the LICENSE file for details.
Thanks to Hugging Face for providing excellent tools and resources for NLP.