Rogue Scores: Evaluation Reproducibility Project
Over 2,000 language modeling papers may report incorrect scores.
Paper
Rogue Scores: Evaluation Reproducibility Project
Over 2,000 language modeling papers may report incorrect scores.
Note
Evaluation Software Errors in the ACL 2023 Proceedings
Incorrect or irreproducible model scores found in 15% of papers.
Note [Updated]
Evaluation Software Errors in the EMNLP 2023 Proceedings
Incorrect or irreproducible model scores found in 10% of papers.
Supplement
Reproducing the Results of Rogue Scores
Step-by-step guide for reproducing the paper's results.
Conference
Presenting Rogue Scores at ACL 2023 in Toronto
Frontenac Ballroom, July 10 from 11:00 to 12:30
Memes
Call For Memes: Virtual Meme Session at ACL 2023
Inviting all attendees to submit their research/cat memes.
Talk
Invited Speaker at Northeastern University GPT Workshop
On the future, ethics, and limitations of large language models.
Website
Website Launch
This website now exists — click to visit!