    gitignore · 14230f34
    Yann Dubois authored
    setting up
    
    clean utils
    
    pairwise lb
    
    types
    
    initial setup
    
    initial requirements
    
    README
    
    pairwise annotator done
    
    openai done
    
    main
    
    metrics
    
    setting up empty
    
    license
    
    all prompts
    
    examples
    
    add anthropic
    
    add claude prompts
    
    minor OAI
    
    anthropic installation
    
    get_decoder
    
    get_decoder
    
    max_instances
    
    adding guanaco
    
    oasst
    
    stablelm
    
    hugging face
    
    remove langchain
    
    minor
    
    finish all decoders
    
    huggingface_local_completions
    
    huggingface_api_completions
    
    PACKAGES_ALL
    
    add opt test
    
    update packages
    
    debugging huggingface_local_completions
    
    api_completions
    
    [ENH] add timer
    
    [ENH] fast hugging face local
    
    [CONF] better default models
    
    [CONF] adding all basic conf
    
    tested all basic configs
    
add constants

add constants

add constants
    
    docstrings
    
gitignore
    
    [ENH] cohere
    
    [CLEAN] use hf datasets
    
    cleaning
    
    cleaning
    
    WIP analyze
    
    fn_completions
    
minor
    
    [ENH] return price and time per example
    
    [ENH] return price and time per example
    
    add price and time for turkers
    
    WIP agreement_of_annotations
    
    [ENH] agreement_of_annotations
    
    [ENH] add vicuna parsing
    
    finish vicuna adding
    
    [SCRIPT] add precompute script
    
    [SCRIPT] add precompute script
    
    add falcon
    
    add vicuna with inputs
    
    black
    
    [ENH] list bias
    
    [ENH] vicuna -> lmsys
    
    [ENH] vicuna -> lmsys
    
    black
    
    alpaca_farm_ppo_human_7b
    
    setup
    
    max_instances
    
    bug vicuna
    
    [ENH] analyze_evaluators
    
    clean prompts
    
    minor
    
    leaderboards
    
    make_evaluator_leaderboard
    
    rm make_evaluator_leaderboard
    
    change gpt3 to text-davinci-003
    
    [ENH] max_instances to precompute
    
    solve merging
    
    evaluator leaderboard
    
    minor
    
    add plotting
    
    add plotting
    
    rename all and finish leaderboard
    
    rm json
    
    add local models to lb
    
    add local models to lb
    
    add local models to lb
    
    add local models to lb
    
    README
    
    update the readme
    
    update the readme
    
    initial adding of constants
    
    ignore
    
    claude lb
    
    formatting
    
    add make_model_leaderboard
    
    update lb
    
    add constants
    
    minor
    
    is_return_instead_of_print
    
    save main outputs
    
    MODELS_TO_BENCHMARK
    
    update claude leaderboard
    
leaderboards
    
    rename
    
    minor
    
    minor
    
    minor
    
    [NOTEBOOK] compare annotators
    
rm .idea
    
    update readme
    
    caching
    
    prices
    
    prices
    
    gpt
    
leaderboards
    
    instruction-following prompt
    
    minor
    
    minor
    
    rm caches
    
    leaderboard claude drop
    
    aviary
    
    aviary
    
    README
    
    aviary
    
    readme
    
    API constants
    
    API constants
    
    making new evaluator
    
    formatting readme
    
    minor
    
    Making a new evaluator
    
    minor
    
    installation
    
    developing notebooks
    
rm unnecessary
    
    ranking
    
    better error
    
    readme
    
    minor
    
    is_single_annotator
    
    leaderboard
    
    ANTHROPIC_MAX_CONCURRENCY
    
    [enh] is_save_to_leaderboard
    
    [enh] is_save_to_leaderboard
    
    imports
    
    ranking_parser
    
    ranking_parser
    
    minor rename
    
    check imports
    
    caching leaderboard
    
    caching leaderboard
    
    rename completion kwargs
    
    rohan benchmarking
    
    rm example
    
    moving to evaluators_configs
    
    single prompt
    
remove all unnecessary prompts
    
    model_configs
    
    rm all input field
    
    update readme
    
    update readme
    
    adding strip
    
    documentation
    
    [CONF] add improved configs
    
    prompts
    
    leaderboards
    
    gitignore
    
    anthropic n_retries
    
    names of models to keep
    
    hugging face inference_helper
    
    save to results
    
    constants
    
    update readme
    
allow globbing
    
    leaderboards
    
    cleaning leaderboards
    
    cleaning leaderboards
    
    package_data
    
    delete example
    
    add manifest
    
    add outputs example
    
    AlpacaEval
    
    finish developing evalset
    
    leaderboards
    
    leaderboards
    
    aviary
    
    bug alpaca farm prompt
    
    leaderboards
    
    leaderboards
    
    bias 1
    
    compare annotators
    
notebook annotators
    
    constants
    
    precompute
    
    allow additional columns
    
    leaderboard
    
    update lb
    
    add table of content
    
    add TOC
    
    adding more dropdowns
    
    update leaderboard
    
    update leaderboards
    
    boilerplate for website
    
    move boilerplate
    
    Create CNAME
    
    Delete CNAME
    
    AlpacaFarm -> AlpacaEval
    
    adding doc
    
    update html
    
    adding helper
    
adding all helpers to README
    
    update all leaderboards
    
    update all leaderboards
    
    smaller example of outputs
    
    add leaderboard modes
    
update readmes
    
    evaluators leaderboard
    
    print_leaderboard
    
update precompute
    
    constants
    
    leaderboard_mode_to_print to analyze eval
    
    update html
    
    add radio buttons
    
update differences with AlpacaFarm
    
    update all notebooks
    
    error out
    
    003 leaderboard
    
    notebooks analyzing all
    
    analyzing_annotators
    
finish plotting of analyses
    
    add figures
    
    add figures
    
adding first plot
    
    finish readme
    
    finish readme
    
    fix typos in readme.
    
    fix citation issues.
    
    fix readme.
    
    fix setup.
    
    minor.
    
    add outputs.json example
    
    fix small issues with first headline cmd.
    
    title aesthetics.
    
    title.
    
    add filters button
    
    add all model configs
    
    add results export file
    
    minor diffs
    
    prettify website
    
update leaderboards
    
    finish website
    
    scoping intro
    
    scoping intro
    
    scoping intro
    
    bug fix
    
    add gpt4 full leaderboard
    
update gpt4 leaderboard website
    
    add interpretation of leaderboards
    
    finish explanation of main eval metrics
    
    finish explanation of all eval metrics
    
    finish explanation of all eval metrics
    
    finish explanation of all eval metrics
    
    finish up to evaluator
    
    test
    
    test
    
    run on claude instead of gpt4
    
    add related work
    
    shorter section
    
    add limitation section
    
    add to related work
    
    add to related work
    
    finish readme
    
update website
    
    format dividers
    
    update readme
    
    make image bigger
    
    make image bigger
    
    add contribution guidelines
    
    typo
    
    update readmes
    
    running notebook
    
    add wizard lm
    
change subtitle website
    
    add link
    
    add github
    
    update leaderboards
    
    last
    
    update
    
    finished through tatsu PR
    
    finished through tatsu PR
    
    pass through tatsu PR
    
    pass through tatsu PR
    
    add github
This project is licensed under the Apache License 2.0.