File(s) under embargo
Reason: The data will be made public once the current paper by Hyder and Uitdenbogerd is published. Until then, advance copies of the dataset can be requested from the authors by email.
Twitter readability judgements data
Dataset posted on 25.11.2020, 03:31 by Sandra Uitdenbogerd, Patrick Jacob
A collection of crowd-sourced (via Figure-Eight) human judgements of "English" tweet readability as assessed by participants with the following language backgrounds: Dutch, German, Mandarin, Cantonese, Spanish, Portuguese.
(RMIT Approved Ethics Application SEHAPP 48-18.)
Any publications using the data should cite:
Jacob, P., & Uitdenbogerd, A. L. (2019). Readability of Twitter Tweets for Second Language Learners. In Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association (pp. 19-27), Meladel Mistica, Massimo Piccardi, Andrew MacKinlay (Editors).