[Figure: process diagram "Domain Adoption & Customization". Roles: Data Scientist, Domain Expert, Annotator. Labelled elements include Tool Composition, Staffing and Project Management, Corpus Stratifying, Domain Knowledge Transfer, Data Annotating, Apply Data Augmentation, Create new corpus version, Training, Metric Compliance, and the decision "Knowledge for annotation present?" (Yes/No). Artifacts shown include Annotation Guidelines, Data Augmentation Strategies, NLP Pipeline, Corpus, Training Dataset, Augmented Training Dataset, Evaluation Dataset, Test Dataset, and Model Metric Report.]

Domain Adoption and Customization

[Subprocess]

Description

This subprocess covers planning and carrying out the annotation (tagging) process that prepares examples for model training. The Data Scientist selects an appropriate toolchain and a representative subset of the available data on which to train a model.
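
One way to picture the selection of a representative subset is stratified sampling: the corpus is grouped by some category and the same share is drawn from each group. The sketch below is only an illustration under stated assumptions. The function name `stratified_sample`, the `doc_type` field, and the 20% fraction are hypothetical and do not come from the process description, which does not prescribe a sampling method.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from each stratum given by `key`."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible

    # Group the records by stratum label.
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    sample = []
    for label in sorted(strata):
        items = strata[label]
        # Keep at least one example per stratum so rare categories stay represented.
        k = max(1, round(len(items) * fraction))
        sample.extend(rng.sample(items, k))
    return sample


# Hypothetical corpus: 100 documents with an assumed "doc_type" field.
corpus = (
    [{"id": i, "doc_type": "contract"} for i in range(80)]
    + [{"id": 80 + i, "doc_type": "invoice"} for i in range(15)]
    + [{"id": 95 + i, "doc_type": "memo"} for i in range(5)]
)

subset = stratified_sample(corpus, key=lambda r: r["doc_type"], fraction=0.2)
```

With these assumed inputs the subset keeps the 80/15/5 proportions of the full corpus, so the examples handed to the annotators reflect the category mix the model is expected to see.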