Wikidata is released entirely under CC0, which makes it very attractive for the project. It contains both sentences and, sometimes, audio, but in this issue I want to focus on sentences.
This issue is a work in progress. I want to collect possible sources of sentences in Wikidata:
- P5831 (usage example): an example sentence for a word, often with the language given in brackets.
- A "Description" exists in many languages for many Wikidata items, but it isn't always a complete sentence.
The next step would be to write a script to scrape these sentences.
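
As a starting point, here is a rough sketch of how such a script could pull P5831 values. It assumes the public Wikidata SPARQL endpoint at `https://query.wikidata.org/sparql` and the standard SPARQL JSON result format, where the language of a literal is given under `xml:lang`. The function names and the User-Agent string are placeholders I made up, not anything that exists in the project yet.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service endpoint.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=10):
    """Build a SPARQL query returning entities with a P5831 (usage example) value."""
    return (
        "SELECT ?lexeme ?example WHERE { "
        "?lexeme wdt:P5831 ?example . "
        "} LIMIT %d" % int(limit)
    )


def parse_bindings(result):
    """Turn a SPARQL JSON result into (entity_uri, language, sentence) tuples."""
    rows = []
    for binding in result["results"]["bindings"]:
        example = binding["example"]
        # The language tag is optional in the result format, so fall back to "".
        rows.append(
            (binding["lexeme"]["value"], example.get("xml:lang", ""), example["value"])
        )
    return rows


def fetch_usage_examples(limit=10):
    """Query the endpoint and return parsed usage examples (needs network access)."""
    params = urllib.parse.urlencode({"query": build_query(limit), "format": "json"})
    request = urllib.request.Request(
        SPARQL_ENDPOINT + "?" + params,
        headers={
            # Hypothetical User-Agent; replace with real project contact details.
            "User-Agent": "sentence-collector-sketch/0.1",
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_usage_examples(10)` should return a small list of tuples to inspect. Descriptions would probably need a separate query, and for a full export a dump-based approach may be kinder to the endpoint than many small queries, but that is untested.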