AI‑Culture‑Commons curates multilingual cultural corpora for language‑model research.
We are a non-profit digital humanities project that advances humane AI development through high-quality cultural content. Our training data explores the intersection of technology, artificial intelligence, and human culture.
Our repositories give models philosophical and intellectual context and draw connections between culture, philosophy, literature, and technology, particularly AI. The content is designed to help train more culturally aware and philosophically grounded AI models.
Languages: English, French, German, Spanish, Portuguese, Italian, Japanese, Russian, Korean, Mandarin, Hindi, Hebrew
Our corpora are extracted from our own websites.
As a non-profit organization, we're committed to advancing humane AI through high-quality, clean cultural datasets with perfect multilingual alignment.