Sites like English-Corpora.org or the American National Corpus (ANC) provide massive datasets for linguistic research.

Show HN: I generated 70k audiobooks with OpenAI Text-to-Speech

You can create simple text files using Notepad (Windows) or TextEdit (Mac).
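For scripted workflows, the same kind of file can also be written from code. The following is a minimal sketch using only the Python standard library; the function name `write_plain_text` and the choice of UTF-8 encoding are assumptions made for illustration, not something the editors above require.

```python
from pathlib import Path


def write_plain_text(path: str, lines: list[str]) -> Path:
    """Write `lines` to `path` as UTF-8 plain text, one entry per line.

    Hypothetical helper for illustration; UTF-8 is an assumed encoding.
    """
    target = Path(path)
    # Join the entries with newlines and end the file with a trailing newline.
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target
```

Calling `write_plain_text("notes.txt", ["First line", "Second line"])` would create `notes.txt` in the current directory, which can then be opened in Notepad or TextEdit like any other text file.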

If a .txt file opens in your browser instead of downloading, you can usually right-click and select "Save As" or press Ctrl+S.
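When there are too many files to save by hand, the download can be scripted instead. This is a rough sketch that assumes the file is reachable at a direct URL without authentication; `save_text_file` is a hypothetical helper name, and it relies only on `urllib.request` and `shutil` from the Python standard library.

```python
import shutil
import urllib.request
from pathlib import Path


def save_text_file(url: str, destination: str) -> Path:
    """Stream the resource at `url` to `destination` and return the saved path.

    Hypothetical helper: assumes a direct, unauthenticated URL.
    """
    target = Path(destination)
    # Copy the response in chunks so large files are never held fully in memory.
    with urllib.request.urlopen(url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

For example, `save_text_file("https://example.com/sample.txt", "sample.txt")` would write the response body to `sample.txt`; the URL here is a placeholder, not a real dataset location.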

A prominent recent project involved generating 70k audiobooks using OpenAI's Text-to-Speech (TTS) models.

If you are looking to download large volumes of text (around 70k files or millions of lines) for training or analysis, large public corpora are a common source.

This was shared on Hacker News as a showcase of AI scalability, where the creator used LLMs to parse text and match character voices for a more immersive experience.

To convert formatted documents, select File -> Save As and choose "Plain Text" as the file type.