{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Preparing your data for labelling\n\nThis example shows you how to go from your raw data to a project with your label\nstructure (ontology) defined, your dataset attached, and ready to be labelled.\n\n## Imports\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from pathlib import Path\n\nfrom encord import Dataset, EncordUserClient, Project\nfrom encord.orm.dataset import CreateDatasetResponse, StorageLocation\nfrom encord.project_ontology.classification_type import ClassificationType\nfrom encord.project_ontology.object_type import ObjectShape\nfrom encord.utilities.project_user import ProjectUserRole" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Authenticating\n

**Note:** To interact with Encord, you **must first authenticate a client**. You can find more details in the Encord SDK authentication documentation.

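Before handing the key to the client, it can help to fail early with a clear message when the key file is missing or empty. The following is a minimal sketch under that assumption; the helper name `read_private_key` is hypothetical, is not part of the Encord SDK, and uses only the Python standard library.

```python
from pathlib import Path


def read_private_key(path: Path) -> str:
    """Return the text of the key file at `path` (hypothetical helper, not an SDK call)."""
    # Fail early if the path does not point to a regular file.
    if not path.is_file():
        raise FileNotFoundError(f"No private key file found at {path}")
    key = path.read_text()
    # An empty or whitespace-only file cannot be a valid key.
    if not key.strip():
        raise ValueError(f"Private key file {path} is empty")
    # Return the content unmodified so the key format is preserved.
    return key
```

The returned string could then be passed to `EncordUserClient.create_with_ssh_private_key`, as in the cell that follows.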
\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authentication: adapt the following line to your private key path\nprivate_key_path = Path.home() / \".ssh\" / \"id_ed25519\"\n\nwith private_key_path.open() as f:\n    private_key = f.read()\n\nuser_client = EncordUserClient.create_with_ssh_private_key(private_key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Creating and populating the dataset\nThis section shows how to create a dataset and add both videos and images to the\ndataset.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Create the dataset\ndataset_response: CreateDatasetResponse = user_client.create_dataset(\n    \"Example Title\", StorageLocation.CORD_STORAGE\n)\ndataset_hash = dataset_response.dataset_hash\n\n# Add data to the dataset\ndataset: Dataset = user_client.get_dataset(dataset_hash)\n\n# Collect the image files and upload them as a single image group\nimage_files = sorted(\n    [\n        p.as_posix()\n        for p in Path(\"path/to/images\").iterdir()\n        if p.suffix in {\".jpg\", \".png\"}\n    ]\n)\ndataset.create_image_group(image_files)\n\n# Collect the video files and upload them one by one\nvideo_files = [\n    p.as_posix()\n    for p in Path(\"path/to/videos\").iterdir()\n    if p.suffix in {\".mp4\", \".webm\"}\n]\n\nfor v in video_files:\n    dataset.upload_video(v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Listing available data in the dataset\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for data_row in dataset.data_rows:\n    print(\n        f\"data-hash: '{data_row.uid}', \"\n        f\"data-type: {data_row.data_type}, \"\n        f\"title: '{data_row.title}'\"\n    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The code will produce an output similar to the following:\n\n    data-hash: '', data-type: DataType.IMG_GROUP, title: 'image-group-68dd3'\n    data-hash: '', data-type: DataType.VIDEO, title: 'video1.mp4'\n\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Creating a project with an ontology\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# == Creating a project containing the dataset created above == #\nproject_hash = user_client.create_project(\n    project_title=\"The title of the project\",\n    dataset_hashes=[dataset_hash],\n    project_description=\"A description of what this project is all about.\",\n)\n\n# == Adding objects and classifications to the project ontology == #\nproject: Project = user_client.get_project(project_hash)\n\n# Objects\nproject.add_object(name=\"Dog (polygon)\", shape=ObjectShape.POLYGON)\nproject.add_object(name=\"Snake (polyline)\", shape=ObjectShape.POLYLINE)\nproject.add_object(name=\"Tiger (bounding_box)\", shape=ObjectShape.BOUNDING_BOX)\nproject.add_object(name=\"Ant (key-point)\", shape=ObjectShape.KEY_POINT)\n\n# Classifications\nproject.add_classification(\n    name=\"Has Animal (radio)\",\n    classification_type=ClassificationType.RADIO,\n    required=True,\n    options=[\"yes\", \"no\"],\n)\nproject.add_classification(\n    name=\"Other objects (checklist)\",\n    classification_type=ClassificationType.CHECKLIST,\n    required=False,\n    options=[\"person\", \"car\", \"leash\"],\n)\nproject.add_classification(\n    name=\"Description (text)\",\n    classification_type=ClassificationType.TEXT,\n    required=False,\n    # Note no `options` defined for text classifications.\n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Adding your team to the project\nTo give annotators, reviewers, and team managers access to your project, add them\nto the project using the email addresses of their Encord accounts. 
Add each type of member\nwith a separate call to the project client:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "project.add_users(\n    [\"annotator1@your.domain\", \"annotator2@your.domain\"],\n    user_role=ProjectUserRole.ANNOTATOR,\n)\nproject.add_users(\n    [\"reviewer1@your.domain\", \"reviewer2@your.domain\"],\n    user_role=ProjectUserRole.REVIEWER,\n)\nproject.add_users(\n    [\"annotator_reviewer@your.domain\"],\n    user_role=ProjectUserRole.ANNOTATOR_REVIEWER,\n)\nproject.add_users(\n    [\"team_manager@your.domain\"],\n    user_role=ProjectUserRole.TEAM_MANAGER,\n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At this point, your data is ready to be annotated with the project-specific\ninformation defined in the project ontology.\n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.17" } }, "nbformat": 4, "nbformat_minor": 0 }