{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n# DEPRECATED - Reading project labels\n\nThis tutorial introduces a deprecated script for reading labels from your Encord project. You are encouraged to\nuse the tools introduced in the Working with the LabelRowV2 section instead.\n\n## Imports and authentication\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from dataclasses import dataclass\nfrom functools import partial\nfrom pathlib import Path\nfrom typing import Callable, Generator, List, Optional\n\nfrom encord import EncordUserClient\nfrom encord.orm.project import Project as OrmProject\nfrom encord.project import Project\nfrom encord.project_ontology.object_type import ObjectShape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "


To interact with Encord, you need to authenticate a client. You can find more details\nin the authentication section of the Encord documentation.

\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authentication: adapt the following line to your private key path\nprivate_key_path = Path.home() / \".ssh\" / \"id_ed25519\"\n\nwith private_key_path.open() as f:\n private_key = f.read()\n\nuser_client = EncordUserClient.create_with_ssh_private_key(private_key)\n\n# Find project to work with based on title.\nproject_orm: OrmProject = next(\n (\n p[\"project\"]\n for p in user_client.get_projects(title_eq=\"The title of the project\")\n )\n)\nproject: Project = user_client.get_project(project_orm.project_hash)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. The high-level view of your labels\nProject data is grouped into label_rows, which point to individual image groups or\nvideos. Each label row will have its own label status, as not all label rows may be\nannotated at a given point in time.\n\nHere is an example of listing the label status of a label row:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Fetch one label row as an example.\nfor label_row in project.label_rows:\n print(label_row)\n break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Expected output:\n\n```\n{\n \"label_hash\": \"\", # or None\n \"data_hash\": \"\",\n \"dataset_hash\": \"\",\n \"data_title\": \"\",\n \"data_type\": \"IMG_GROUP\", # or VIDEO\n \"label_status\": \"NOT_LABELLED\",\n \"annotation_task_status\": \"ASSIGNED\"\n}\n```\nFrom the high-level data, you can, for example, compute some statistics of the\nprogress of your annotators:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "status_counts = {}\nfor label_row in project.label_rows:\n status = label_row[\"annotation_task_status\"]\n status_counts[status] = status_counts.setdefault(status, 0) + 1\nprint(status_counts)" ] 
}, { "cell_type": "markdown", "metadata": {}, "source": [ "Expected output:\n\n```\n{'RETURNED': 1, 'COMPLETED': 3, 'QUEUED': 20, 'IN_REVIEW': 3, 'ASSIGNED': 1}\n```\n\n## 2. Getting all label details\nThe actual labels of a label row are fetched with\n``EncordClientProject.get_label_row()``. This method returns a nested\ndictionary with all details about classifications and objects.\nIn this section, we show how to build a list of all bounding boxes that have been\nreviewed and marked as approved.\n\nFirst, we define a data class to hold the information of interest.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "@dataclass(frozen=True)\nclass AnnotationObject:\n    object: dict\n    file_name: str\n    data_url: str\n    frame: Optional[int] = None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we define a function which iterates over all objects of a label row fetched with\n``EncordClientProject.get_label_row()``. 
The function takes a callable argument\nthat decides which objects are returned.\n\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def iterate_over_objects(\n    label_row_details,\n    include_object_fn: Callable[[dict], bool],\n) -> Generator[AnnotationObject, None, None]:\n    \"\"\"\n    Iterate over objects in a label row.\n\n    :param label_row_details: the detailed label row to fetch objects from\n    :param include_object_fn: A callable indicating whether to include an object.\n    :return: Yields AnnotationObjects.\n    \"\"\"\n    if label_row_details[\"data_type\"] == \"IMG_GROUP\":\n        # Image groups have multiple data_units (one for each image)\n        for data_unit in label_row_details[\"data_units\"].values():\n            url = data_unit[\"data_link\"]\n            file_name = data_unit[\"data_title\"]\n            objects = data_unit[\"labels\"][\"objects\"]\n            for object in objects:\n                if include_object_fn(object):\n                    yield AnnotationObject(object, file_name, url)\n\n    else:\n        # Videos have a single data unit, but multiple frames.\n        # Need to iterate through frames instead.\n        data_unit = list(label_row_details[\"data_units\"].values())[0]\n\n        url = data_unit[\"data_link\"]\n        file_name = data_unit[\"data_title\"]\n        for frame, labels in data_unit[\"labels\"].items():\n            for object in labels[\"objects\"]:\n                if include_object_fn(object):\n                    yield AnnotationObject(object, file_name, url, frame)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Then we define a function that chooses which objects to include.\n\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def include_object_fn_base(\n    object: dict,\n    object_type: Optional[ObjectShape] = None,\n    only_approved: bool = True,\n) -> bool:\n    # Filter object type\n    if object_type and object[\"shape\"].lower() != object_type.value.lower():\n        return False\n\n    # Filter reviewed status: keep only objects whose latest review is approved.\n    # The parentheses matter: without them, `and` binds tighter than `or`.\n    if only_approved and (\n        not object[\"reviews\"] or not object[\"reviews\"][-1][\"approved\"]\n    ):\n        return False\n\n    return True\n\n\n# Trick to preselect object_type\ninclude_object_fn_bbox: Callable[[dict], bool] = partial(\n    include_object_fn_base, object_type=ObjectShape.BOUNDING_BOX\n)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now we can use the iterator and the filter to collect the objects.\n\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "reviewed_bounding_boxes: List[AnnotationObject] = []\nfor label_row in project.label_rows:\n    if not label_row[\"label_hash\"]:  # No objects in this label row yet.\n        continue\n\n    # Only set the `include_reviews` flag to `True` if the reviews payload is needed.\n    label_row_details = project.get_label_row(\n        label_row[\"label_hash\"], include_reviews=True\n    )\n    reviewed_bounding_boxes += list(\n        iterate_over_objects(label_row_details, include_object_fn_bbox)\n    )\n\nprint(reviewed_bounding_boxes)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Expected output:\n\n```python\n[\n    AnnotationObject(\n        object={\n            \"name\": \"Name of the object annotated\",\n            \"color\": \"#FE9200\",\n            \"shape\": \"bounding_box\",\n            \"value\": \"name_of_the_object_annotated\",\n            \"createdAt\": \"Wed, 18 May 2022 14:07:14 GMT\",\n            \"createdBy\": \"annotator1@your.domain\",\n            \"confidence\": 1,\n            \"objectHash\": \"\",\n            \"featureHash\": \"\",\n            \"manualAnnotation\": True,\n            \"boundingBox\": {\n                \"h\": 0.8427,\n                \"w\": 0.5857,\n                \"x\": 0.3134,\n                \"y\": 0.1059,\n            },\n            \"reviews\": [\n                {\n                    \"exists\": True,\n                    \"comment\": None,\n                    \"approved\": True,\n                    \"instance\": {\n                        \"name\": \"nested_classifications\",\n                        \"range\": [[0, 0]],\n                        \"shape\": \"bounding_box\",\n                        \"objectHash\": \"\",\n                        \"featureHash\": \"\",\n                        \"classifications\": [],\n                    },\n                    \"createdAt\": \"Wed, 18 May 2022 14:07:42 GMT\",\n                    \"createdBy\": \"reviewer1@your.domain\",\n                    \"rejections\": None,\n                }\n            ],\n        },\n        file_name=\"your_file_name.jpg\",\n        data_url=\"\",\n        frame=None,  # or a number if video annotation\n    )\n    # ...\n]\n```\n\nFrom this template, you can extract various subsets of objects by changing the\narguments to ``include_object_fn_base``. For example, to get all polygons, set the\n``object_type`` argument to ``ObjectShape.POLYGON``.\nSimilarly, you can replace ``include_object_fn_base`` with your own filtering function\nto select only the objects that you need.\nFinally, if you want classifications rather than objects, change the two\n``\"objects\"`` dictionary lookups in ``iterate_over_objects`` to\n``\"classifications\"`` and compose a new filtering function.\n\n## 3. Fetching nested classifications\nIt is possible to make nested classifications on objects. The information about such\nnested classifications is stored in the ``classification_answers``, ``object_answers``\nand ``object_actions`` sections of the ``label_row_details``.\n\nAssuming that the reviewed bounding boxes fetched above have nested attributes, the\nfollowing code example shows how to get the nested classification information.\n\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Assumes the last reviewed bounding box belongs to `label_row_details`,\n# i.e., the last label row fetched in the loop above.\nprint(\n    label_row_details[\"object_answers\"][\n        reviewed_bounding_boxes[-1].object[\"objectHash\"]\n    ]\n)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Expected output:\n\n```\n{\n    'classifications': [\n        {\n            'answers': [\n                {\n                    'featureHash': '',\n                    'name': 'nested option 1',\n                    'value': 'nested_option_1'\n                }\n            ],\n            'featureHash': '',\n            'manualAnnotation': True,\n            'name': 'Nested classification.',\n            'value': 'nested_classification.'\n        }\n    ],\n    'objectHash': 'e413a414'\n}\n```\n" ] } ],
"metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython",
"version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.16" } }, "nbformat": 4, "nbformat_minor": 0 }