Reason: The data will be made public once the current paper by Hyder and Uitdenbogerd is published. Until then, advance copies of the dataset can be requested from the authors by email.





Twitter readability judgements data

A collection of crowd-sourced (via Figure-Eight) human judgements of "English" tweet readability as assessed by participants with the following language backgrounds: Dutch, German, Mandarin, Cantonese, Spanish, Portuguese.
Demographic data for the participants is also included.
(RMIT Approved Ethics Application SEHAPP 48-18.)

Any publications using the data should cite:

Jacob, P., & Uitdenbogerd, A. L. (2019). Readability of Twitter Tweets for Second Language Learners. In Meladel Mistica, Massimo Piccardi, & Andrew MacKinlay (Eds.), Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association (pp. 19-27).