Project Catapult: Re-thinking Bible Translation through Assisted Translation Technology
Picture a future where the Bible is readily available to everyone, regardless of how many people speak their language. This vision propels Project Catapult (PC), an initiative that combines the achievements of established Bible translation practice with state-of-the-art technology. The primary objective of PC is to harness AI and other translation technologies, known as ‘Assisted Translation Technologies,’ to help achieve the All-Access Goals (AAGs).
The strategy of Catapult starts with selecting 50 languages from the AAG list. These languages are then grouped into clusters based on their language families. This approach allows translation technology to learn from specific language and cluster-related data, leading to enhanced and accelerated translation efforts within these language families.
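The grouping step described above can be sketched in a few lines of Python. This is only an illustration of clustering by language family: the function name `cluster_by_family` and the language and family labels are placeholders invented for the example, not actual AAG selections or the Lab's tooling.

```python
from collections import defaultdict


def cluster_by_family(languages):
    """Group (language, family) pairs into clusters keyed by language family."""
    clusters = defaultdict(list)
    for name, family in languages:
        clusters[family].append(name)
    # Sort families and members so the result is stable and reproducible.
    return {family: sorted(names) for family, names in sorted(clusters.items())}


# Placeholder entries only; these are not real AAG languages.
candidates = [
    ("lang_a", "family_1"),
    ("lang_b", "family_2"),
    ("lang_c", "family_1"),
]
clusters = cluster_by_family(candidates)
```

Here `lang_a` and `lang_c` end up in the same cluster, so data curated for one could in principle be reused for the other.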
This article gives a high-level view of the steps in the Innovation Lab’s planned process:
Language Selection:
PC will focus exclusively on AAG languages. One important assumption to be tested is whether working with language clusters improves efficiency. This would involve leveraging related languages to accelerate translation while minimizing duplication in the language assessment and data curation phases. This step consists of four distinct phases: identifying language clusters, then analyzing, verifying, and assessing them. Progressing through these phases will form a solid foundation for the data strategy component later in the overall process.
Partner Selection:
Project Catapult needs qualified, available, and aligned partners to succeed. Partners that understand the uncertainties of experimental technologies and have the infrastructure and resources to support the translation work will be vital to the success of the initiative. The goal is to build long-term partnerships that span multiple language clusters.
Language Data Strategy:
The language data strategy is the engine room of Project Catapult, generating the slingshot momentum needed for translation projects. The effective use of Advanced Translation Technology relies on the availability of high-quality language data. To be valuable, the data must offer wide semantic and grammatical coverage and adhere to the target domain.
Languages can be classified into three categories based on the availability of linguistic data: low-resource, medium-resource, and high-resource.
This classification affects how advanced translation technologies can be used. Low-resource languages require a focus on data collection and resource creation, because it is difficult to train accurate translation models at this stage. Medium-resource languages call for a combination of data collection and data augmentation. For high-resource languages, the data strategy should focus on refinement, quality, and the development of advanced language technologies.
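As a minimal sketch, the mapping from resource level to data strategy could be expressed as follows. The article gives no numeric cut-offs, so the thresholds `LOW_MAX` and `MEDIUM_MAX` and the choice of aligned sentence pairs as the measure are assumptions made purely for illustration.

```python
# Hypothetical cut-offs, measured in aligned sentence pairs.
# The article does not define thresholds; these values are illustrative only.
LOW_MAX = 10_000
MEDIUM_MAX = 100_000

# Strategy descriptions paraphrase the three categories in the text above.
STRATEGY = {
    "low": "data collection and resource creation",
    "medium": "data collection combined with data augmentation",
    "high": "refinement, quality, and advanced language technologies",
}


def resource_tier(parallel_sentences: int) -> str:
    """Return 'low', 'medium', or 'high' for a given amount of parallel data."""
    if parallel_sentences < 0:
        raise ValueError("parallel_sentences must be non-negative")
    if parallel_sentences < LOW_MAX:
        return "low"
    if parallel_sentences < MEDIUM_MAX:
        return "medium"
    return "high"


def data_strategy(parallel_sentences: int) -> str:
    """Look up the suggested data focus for a language's resource tier."""
    return STRATEGY[resource_tier(parallel_sentences)]
```

In practice the tier would likely depend on more than one count (for example monolingual text, lexicons, and existing scripture portions), so a single threshold is a deliberate simplification.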
Translation Team:
The Lab’s partners will be key in selecting and managing the translation teams. As each translation team is established, the Lab sees four things as important. First, partners must carefully select and manage team members with the required skills, abilities, and open-mindedness toward technology. Second, the Lab strongly recommends training; in three of the Lab’s pilot projects with Biblica, teams have successfully used the Digital Training Library. Third, having a translation brief and style guide in place is essential to maintain consistency, accuracy, and efficiency throughout the translation process. Finally, translation teams are advised to include a mentor in the overall process to help them stay motivated, keep learning, and make effective use of technology throughout the project.
Approach and Language Model Training:
PC uses two translation approaches: suggestion-based and batch translation. The suggestion-based approach is used for low-resource languages. It builds up a translation memory during the translation process, promoting consistency and efficiency. The batch translation approach can be adopted for languages with sufficient language data and a well-developed language model. It uses that model to generate scripture portions in batches, which contributes to accurate and contextually appropriate translations. In both approaches, the translation team plays a pivotal role. The aim is not to replace the translation teams but to augment their efforts using technology.
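To make the idea of a translation memory concrete, here is a minimal sketch of one that grows as the team works and offers a suggestion when a new source segment closely resembles one already translated. The class `TranslationMemory`, the 0.75 similarity threshold, and the placeholder target strings are assumptions for illustration; the tools named in this article may work quite differently. Similarity is computed with `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Stores confirmed source/target pairs and suggests fuzzy matches."""

    def __init__(self, threshold=0.75):
        # Minimum similarity (0.0 to 1.0) before a suggestion is offered.
        # The default is an illustrative assumption, not a recommended value.
        self.threshold = threshold
        self.entries = {}

    def add(self, source, target):
        """Record a translation the team has confirmed."""
        self.entries[source] = target

    def suggest(self, source):
        """Return (target, score) for the closest stored source, or None."""
        best_score, best_target = 0.0, None
        for known_source, known_target in self.entries.items():
            score = SequenceMatcher(
                None, source.lower(), known_source.lower()
            ).ratio()
            if score > best_score:
                best_score, best_target = score, known_target
        if best_target is not None and best_score >= self.threshold:
            return best_target, best_score
        return None


tm = TranslationMemory()
# Placeholder target text; a real entry would hold the team's translation.
tm.add("In the beginning", "<target rendering 1>")
exact = tm.suggest("In the beginning")
near = tm.suggest("In the beginning,")
miss = tm.suggest("zzzz")
```

The translator remains in control: a suggestion is only a starting point, and only translations the team confirms are added back to the memory.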
Drafting:
During the Bible translation drafting phase, the partner organization takes the lead with the Lab's guidance. Editors review the translated text for accuracy, internal and external reviews gather expert input, and third-party experts assess the translation. Tools like Scripture Forge (SF) and Scribe aid the process, and outsourcing to translation service providers (TSPs) like Lilt is also being considered. Integrating drafting with quality assessment is vital: it gives readers confidence that the translations are reliable and helps make them impactful.
Quality Assurance:
A comprehensive quality assessment approach, which will involve developing a quality matrix or grid, will complement the drafting process. Quality assessment can be subjective or objective. Subjective factors are influenced by culture and society and involve translators and the community. Objective elements include quality metrics based on the original languages (Greek/Hebrew) and statistical modeling to identify inconsistencies. Project Catapult aims to achieve high-quality Bible translations that satisfy both objective and subjective quality measures. This will be done by combining automated tools, community involvement, and the expertise of Bible translators.
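One simple objective check of the kind mentioned above is key-term consistency: flagging source-language terms that are rendered in several different ways in the draft. The sketch below assumes that term alignment has already been done upstream and arrives as (source term, rendering) pairs; the function name, the `min_share` threshold, and the placeholder terms are all hypothetical, and this is not presented as the Lab's actual statistical method.

```python
from collections import Counter, defaultdict


def find_inconsistent_renderings(aligned_terms, min_share=0.8):
    """Flag source terms whose most common rendering falls below min_share.

    aligned_terms: iterable of (source_term, rendering) pairs.
    Returns {source_term: {rendering: count}} for flagged terms only.
    """
    by_term = defaultdict(Counter)
    for source_term, rendering in aligned_terms:
        by_term[source_term][rendering] += 1

    flagged = {}
    for term, counts in by_term.items():
        total = sum(counts.values())
        _, top_count = counts.most_common(1)[0]
        # A dominant rendering suggests consistency; a split suggests review.
        if top_count / total < min_share:
            flagged[term] = dict(counts)
    return flagged


# Placeholder data standing in for aligned Greek/Hebrew key terms.
pairs = (
    [("term_x", "rendering_1")] * 3
    + [("term_x", "rendering_2")] * 2
    + [("term_y", "rendering_3")] * 4
)
flagged = find_inconsistent_renderings(pairs)
```

A flagged term is not necessarily an error, since context can justify different renderings; the output is a list for translators and reviewers to examine, which is where the subjective side of quality assessment comes in.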
Iterative Publishing:
Iterative publishing, a practice commonly used in software development and digital content creation, involves continuously refining and updating content based on user feedback and improvements. This approach, adaptable to Bible translation, allows for dynamic enhancements to the translation over time. It includes releasing smaller scripture portions progressively, refining the translation based on user input and linguistic analysis, and fostering user engagement. Additionally, it offers flexibility for different contexts, early access to scripture, collaborative translation, and higher community adoption. However, ensuring reliability and stability through quality assurance and theological oversight remains crucial.
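The release cycle described above can be pictured as a small data model: each scripture portion carries a revision number and a list of pending feedback, and a new revision is published only once review has approved it. The `Portion` class and its fields are hypothetical names chosen for this sketch and do not describe any existing publishing system.

```python
from dataclasses import dataclass, field


@dataclass
class Portion:
    """A published scripture portion that is refined over successive revisions."""

    name: str
    text: str
    revision: int = 1
    pending_feedback: list = field(default_factory=list)

    def add_feedback(self, note):
        """Collect user or reviewer input for the next revision."""
        self.pending_feedback.append(note)

    def publish_revision(self, new_text, approved):
        """Release a new revision only if quality and theological review approved it."""
        if not approved:
            return False
        self.text = new_text
        self.revision += 1
        self.pending_feedback.clear()
        return True


portion = Portion("portion_1", "draft text, revision 1")
portion.add_feedback("placeholder community comment")
rejected = portion.publish_revision("draft text, revision 2", approved=False)
accepted = portion.publish_revision("draft text, revision 2", approved=True)
```

Making approval an explicit gate reflects the point above that quality assurance and theological oversight remain part of every iteration.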
Risks:
Launching Project Catapult involves a number of potential risks: limited availability of language data, community engagement challenges, resistance to technology adoption, balancing quality assurance tools, managing iterative publishing, selecting suitable partners, managing the project effectively, technology risks, and maintaining contextual relevance. Careful planning, ongoing evaluation, strong communication, and collaboration among stakeholders will be pivotal in mitigating these risks and achieving the project's mission of universal Bible accessibility.
Project Catapult marks the start, not the endpoint, of using Advanced Translation Technology for Bible translation. This is the beginning of an exciting new phase in which translation technologies are thriving and picking up pace, and the Lab aims to make the most of this momentum. The Innovation Lab approaches this work with humility and is eager to collaborate with organizations that share its vision and goals.