New metric definitions for llama-3-3-70b as judge in Arena Hard benchmark #7498
Annotations
1 error
Run /./.github/actions/install-internal-pip
Process completed with exit code 1.