Million Book Project
The Million Book Project was a book digitization project led by Raj Reddy at the Carnegie Mellon University School of Computer Science and University Libraries from 2001 to 2008. Working with government and research partners in India and China, the project scanned books in many languages, used OCR to enable full-text searching, and provided free-to-read access to the books on the web. The project completed the scanning of 1 million books and made the entire catalog accessible online.
Description
The Million Book Project was a 501 charity organization with various scanning centers throughout the world. By December 2007, more than 1.5 million books had been scanned in 20 languages, including 970,000 in Chinese, 360,000 in English, 50,000 in Telugu, and 40,000 in Arabic. Most of the books are in the public domain, but permission has been acquired to include over 60,000 copyrighted books. The books are mirrored in part at sites in India, China, Carnegie Mellon, the Internet Archive, and Bibliotheca Alexandrina. Not all of the scanned books are available online yet, and no single site has copies of all the books that are available online.
The Million Book Project was a "proof of concept" that has largely been replaced by the HathiTrust, Google Book Search, and Internet Archive book scanning projects.
The Internet Archive may have some books that Google does not.
The National Science Foundation awarded Carnegie Mellon $3.63M over four years for equipment and administrative travel for the Million Book Project. India provided $25M annually to support language translation research projects. The Ministry of Education in China provided $8.46M over three years. The Internet Archive provided equipment, staff and money. The University of California, Merced Library funded the work to acquire copyright permission from U.S. publishers.
The program ended in 2008. The Internet Archive hosted an online symposium in 2021 to celebrate the 20th anniversary of the Million Book Project.
Partner institutions
China
The institutions in China that participated in this project include:
- Ministry of Education of the People's Republic of China
- Chinese Academy of Sciences
- Fudan University
- Nanjing University
- Peking University
- Tsinghua University
- Zhejiang University
- Northeast Normal University
India
- Indian Institute of Science, Bangalore
- International Institute of Information Technology, Hyderabad
- Indian Institute of Information Technology, Allahabad
- Anna University, Chennai
- Mysore University, Mysore
- University of Pune, Pune
- Goa University, Goa
- Tirumala Tirupati Devasthanams, Tirupathi
- Shanmugha Arts, Science, Technology & Research Academy, Tanjore
- Kalasalingam Academy of Research and Education, Srivilliputhur
- Maharashtra Industrial Development Corporation, Mumbai
United States
- Internet Archive
- Indiana University
- Pennsylvania State University
- Stanford University
- TriColleges
- University of California, Berkeley
- University of California, Merced
- University of Pittsburgh
- University of Washington
Europe