A2rchi v1.2.0 Release Notes
Overview
This major release includes more than 200 commits since v1.1.0, bringing significant architectural improvements, new features, enhanced configurability, and a better developer experience to A2rchi.
Major Features
Multi-Configuration Support
- Multiple Prompt Configurations: Users can now run A2rchi with multiple configuration files simultaneously, allowing different prompt strategies for different use cases
- Dynamic Configuration Switching: New UI dropdown allows switching between different prompt configurations on the fly
- API Endpoint for Prompts: Added `/prompts` endpoint to retrieve available prompt configurations
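As a rough illustration of how a client might query the new endpoint, the sketch below issues a GET request to `/prompts` and decodes the JSON body. The base URL is an assumption, and because these notes do not describe the response schema, the payload is returned untouched rather than unpacked into fields. The `opener` parameter is a hypothetical hook that lets the function be exercised without a running service.

```python
import json
from urllib.request import urlopen


def fetch_prompts(base_url, opener=urlopen):
    """GET <base_url>/prompts and return the decoded JSON body.

    The response schema is not specified in the release notes, so the
    payload is returned as-is instead of being mapped onto named fields.
    """
    url = base_url.rstrip("/") + "/prompts"
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_prompts("http://localhost:8000")` would query a service at that address; the host and port here are placeholders for whatever your deployment exposes.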
Enhanced Data Sources & Scrapers
- Git Repository Scraper: New scraper for ingesting documentation directly from Git repositories, including support for MkDocs sites
- Authentication support with username and personal access tokens
- Automatic detection and handling of MkDocs repositories
- Configurable via `git_username` and `git_token` secrets
- Sources Registry System: Implemented centralized sources registry for better management of data ingestion pipelines
- SSO Scraper Improvements: Enhanced SSO scraping with better recursion handling and URL tracking in vector database
Retrieval Enhancements
- Hybrid Search with BM25: Introduced hybrid retrieval combining semantic search with BM25 keyword matching for improved accuracy
- Configurable hybrid retriever settings
- Better handling of keyword-based queries
- Document Stemming: Optional stemming of documents before embedding creation for improved retrieval accuracy
- Configurable stemming for both documents and queries
- Particularly useful for technical documentation
- Embedding Options: Enhanced embedding configuration including:
- Distance metric selection
- Custom embedding instructions
- Embedding model selection improvements
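To make the hybrid retrieval idea concrete, here is a minimal sketch that scores documents with Okapi BM25, normalizes those scores alongside semantic similarity scores, and ranks by a weighted sum. It is not A2rchi's implementation: the toy `stem` function, the min-max normalization, the `bm25_weight` parameter, and the hard-coded semantic scores are all assumptions chosen to keep the example self-contained. The same stemming step is applied to documents and queries so that their tokens stay comparable.

```python
import math
from collections import Counter


def stem(token):
    """Toy suffix stripper standing in for a real stemmer (illustrative only)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text, use_stemming=True):
    """Lowercase, split on whitespace, and optionally stem each token."""
    tokens = text.lower().split()
    return [stem(t) for t in tokens] if use_stemming else tokens


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    doc_freq = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


def min_max(values):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(bm25, semantic, bm25_weight=0.5):
    """Weighted sum of normalized scores; returns document indices, best first."""
    b_norm, s_norm = min_max(bm25), min_max(semantic)
    combined = [bm25_weight * x + (1 - bm25_weight) * y
                for x, y in zip(b_norm, s_norm)]
    return sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)


docs = [
    "configuring the scraper for git repositories",
    "monitoring dashboards in grafana",
    "scraping sso protected pages",
]
docs_tokens = [tokenize(d) for d in docs]
query_tokens = tokenize("git scrapers")
keyword = bm25_scores(query_tokens, docs_tokens)
semantic = [0.62, 0.10, 0.55]  # stand-in similarities from an embedding model
ranking = hybrid_rank(keyword, semantic)
```

The semantic scores are treated as similarities (higher is better). If the configured distance metric yields distances instead, they would need to be inverted before being combined.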
New LLM Integrations
- Ollama Support: Full integration with Ollama for local model inference
- Configurable GPU allocation (`num_gpu: -1` for all available GPUs)
- Support for various Ollama models
- vLLM Improvements: Enhanced vLLM integration for faster inference
- Improved HuggingFace Support: Better handling of HuggingFace models for both inference and evaluation
Improvements
Architecture & Code Organization
- Chain/Pipeline Abstraction: Major refactoring of the chain system
- Introduced `BasePipeline` for generalized LLM and prompt initialization
- Support for multiple pipelines running simultaneously
- Cleaner separation between chains, workflows, and wrappers
- Created dedicated `chains.py` module
- Configuration Structure Overhaul: Completely restructured configuration file format
- More intuitive hierarchy
- Better validation and error handling
- Support for pipelines (plural) in configuration
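The pipeline abstraction described above can be pictured with a small sketch: a base class owns the shared LLM and prompt setup, subclasses define how a question is processed, and several pipelines coexist under different names. Every name here (`SketchBasePipeline`, `QAPipeline`, `echo_llm`, the prompt templates) is hypothetical and does not reflect the actual `BasePipeline` interface.

```python
from abc import ABC, abstractmethod


class SketchBasePipeline(ABC):
    """Hypothetical base class holding the shared LLM and prompt setup."""

    def __init__(self, name, llm, prompt_template):
        self.name = name
        self.llm = llm  # any callable mapping a prompt string to a reply string
        self.prompt_template = prompt_template

    def render(self, **fields):
        """Fill the prompt template with the given fields."""
        return self.prompt_template.format(**fields)

    @abstractmethod
    def invoke(self, question):
        """Run the pipeline for one question and return the answer text."""


class QAPipeline(SketchBasePipeline):
    """Simplest concrete pipeline: render the prompt and call the model once."""

    def invoke(self, question):
        return self.llm(self.render(question=question))


def echo_llm(prompt):
    """Stand-in for a real model client so the sketch runs offline."""
    return f"[echo] {prompt}"


# Several pipelines can coexist, keyed by name, each with its own prompt.
pipelines = {
    p.name: p
    for p in (
        QAPipeline("concise", echo_llm, "Answer briefly: {question}"),
        QAPipeline("detailed", echo_llm, "Answer step by step: {question}"),
    )
}
```

Keeping initialization in the base class means a new pipeline only has to implement `invoke`, which is one plausible reading of the "generalized LLM and prompt initialization" goal.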
CLI Enhancements
- New CLI Implementation: Rebuilt CLI with improved functionality
- `--config` flag to specify configuration files
- `-d`/`--dry-run` mode for testing configurations without deployment
- `-f`/`--force` flag for `a2rchi create` to automatically delete existing deployment first
- `--print-config` option to display the loaded configuration
- `-p` flag added to `a2rchi delete` for proper cleanup
- Better Secret Handling: Improved management of API keys and passwords across services
Container & Deployment
- Base Images: Created optimized base Docker images
- `pytorch-base` and `python-base` images now available on DockerHub
- Significantly faster deployment times
- GPU and non-GPU variants for different use cases
- Slimmer Images: Optional lightweight images without GPU dependencies when running API-based models
- Requirements Reorganization: Split requirements into multiple files for better dependency management
- CUDA Version Update: Fixed CUDA mismatch issues, now running CUDA 12.4
- OpenShift/OKD Support: Added proper permissions and configurations for Kubernetes deployments
- Health Check Probes: Implemented health check endpoints for container orchestration
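For readers unfamiliar with orchestration probes, the following sketch shows one common way a health endpoint decides what to return: run a set of component checks and answer with HTTP 200 when all pass or 503 otherwise. The function name, the component names, and the JSON shape are assumptions for illustration and are not taken from A2rchi's endpoints.

```python
import json


def health_response(checks):
    """Build an (HTTP status, JSON body) pair from named readiness checks.

    `checks` maps a component name to a zero-argument callable that
    returns True when the component is usable. A check that raises is
    recorded as failed instead of crashing the probe.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    healthy = all(results.values())
    status = 200 if healthy else 503
    return status, json.dumps({"healthy": healthy, "checks": results})
```

An orchestrator probe only inspects the status code, so the JSON body is mainly useful for people debugging a failing container.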
Network & API
- Host Mode Support: Fixed and improved host networking mode
- Grafana works correctly in host mode
- ChromaDB respects `chromadb_external_port` configuration in host mode
- Same-Origin API Calls: Optimized API calls from the frontend to avoid unnecessary host/port additions
- HTTPS Support: Frontend can now communicate with HTTPS APIs
- ChromaDB API Endpoints: Added REST API for ChromaDB operations
- Document listing endpoint
- Document search endpoint
- Configurable enable/disable option
- Comprehensive API documentation in user guide
Developer Experience
- GitHub Actions CI/CD:
- Added smoke tests for PR validation
- PR preview environment automation
- Automated testing pipeline
- Logging Improvements:
- Better structured logging across all containers
Interface Improvements
- Redmine Enhancements:
- Bug fixes for Redmine mailer
- Ticket client improvements
- Better Postgres integration
- Grafana Monitoring:
- Added retrieval scores to Grafana dashboards
- Better visualization of context and history
- Timeout and batch size configurations
Benchmarking & Evaluation
- Benchmarking Framework: New comprehensive benchmarking functionality
- Support for multiple evaluation LLMs and providers
- HuggingFace evaluation model support
- Better default evaluation models
- Configurable timeout, embedding model, verbosity, and batch_size
- Queries configuration via `queries.json`
- Plotting dependencies and visualization tools
- Dedicated benchmarking documentation
Documentation
- Complete Documentation Overhaul: Comprehensive rewrite of user and developer guides
- New User Guide Sections:
- ChromaDB API endpoints documentation
- Hybrid search documentation
- Stemming and Ollama interface documentation
- Git scraper setup and usage
- Benchmarking guide
- Vector store configuration
- Developer Guide Updates: Enhanced developer documentation with architectural diagrams
- README Improvements: Updated README with new logo, clearer instructions, and current examples
- API Documentation: Complete API endpoint documentation
- Configuration Examples: Added example configurations for common use cases
Infrastructure Changes
- Directory Restructure: Major reorganization of `src/data_manager/` and related directories
- Requirements Management: Split into multiple organized requirements files
- Automated Image Publishing: Script to push new Docker images to registry
- MkDocs Material: Integration with MkDocs Material for enhanced documentation
- Firefox GPU Support: Fixed Firefox compatibility for GPU-accelerated instances
- .gitignore Updates: Proper handling of .env files and .github workflows
Configuration Changes
New Configuration Options
- `chromadb_external_port`: Configure external port for ChromaDB in host mode
- `enable_chromadb_api`: Toggle ChromaDB API endpoints
- `hybrid_search`: Enable/disable hybrid retrieval
- `stemming`: Configure document and query stemming
- `distance_metric`: Choose embedding distance metric
- `embedding_instructions`: Custom instructions for embedding models
- `ollama_num_gpu`: GPU allocation for Ollama models
- `timeout`, `batch_size`, `verbosity`: Benchmarking configurations
- `pipelines`: Support for multiple pipeline configurations (replaces single `chain`)
Breaking Changes
- Configuration structure has been significantly refactored
- `chain` configuration section renamed to `pipelines` (plural)
- Some configuration keys have been reorganized into new hierarchies
- Old configuration files will need to be migrated to new format
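When auditing older configuration files, a small helper like the one below may help locate leftover `chain` sections once a file has been loaded into nested dictionaries and lists. It only reports where the old key appears and does not rewrite anything, since the notes do not spell out the new hierarchy. The function name and the example keys used in the usage note are hypothetical.

```python
def find_legacy_chain_keys(config, path=""):
    """Return dotted paths of every `chain` key in a nested config structure."""
    hits = []
    if isinstance(config, dict):
        for key, value in config.items():
            here = f"{path}.{key}" if path else str(key)
            if key == "chain":
                hits.append(here)
            # Keep descending so nested occurrences are reported too.
            hits.extend(find_legacy_chain_keys(value, here))
    elif isinstance(config, list):
        for index, item in enumerate(config):
            hits.extend(find_legacy_chain_keys(item, f"{path}[{index}]"))
    return hits
```

For example, `find_legacy_chain_keys({"service": {"chain": {}}})` reports the path `service.chain`, while a structure that contains no `chain` key yields an empty list.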
Testing & Quality
- Added smoke tests for core functionality
- Implemented GitHub Actions for automated testing
- PR preview environments for testing changes
- Improved error handling throughout codebase
- Better validation of configuration files
Resources
- [Documentation](https://mit-submit.github.io/A2rchi/)
- [User Guide](https://mit-submit.github.io/A2rchi/user_guide/)
- [GitHub Repository](https://github.com/mit-submit/A2rchi)
- [Issue Tracker](https://github.com/mit-submit/A2rchi/issues)
Full Changelog: v1.1.0...v1.2.0