Source Data Explanation
Weikai Huang edited this page Jul 4, 2024 · 2 revisions
Task-Me-Anything comprises various types of source data, including:
- 2D images
- 3D assets
- Real images and videos with scene graphs
- Human annotations detailing angles, materials, colors, and shapes of 3D objects.
- A taxonomy that reflects the relationships between different concepts within the source data.
This document explains the source data annotations in the TaskMeAnything/annotations folder.
These four files contain all the annotations for the source data:
- attribute_category.json
- cateid_to_concept.json
- cateid_to_objects.json
- taxonomy.json
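As a minimal sketch, the four files can be read with Python's standard `json` module. The helper name `load_annotations` is hypothetical (it is not part of Task-Me-Anything), and the directory you pass in is assumed to contain the four files listed above.

```python
import json
from pathlib import Path

# The four annotation files listed above.
ANNOTATION_FILES = (
    "attribute_category.json",
    "cateid_to_concept.json",
    "cateid_to_objects.json",
    "taxonomy.json",
)


def load_annotations(annotations_dir):
    """Return {file name without extension: parsed JSON} for the four files.

    Hypothetical helper for illustration; not part of the repository.
    """
    root = Path(annotations_dir)
    annotations = {}
    for name in ANNOTATION_FILES:
        with open(root / name, encoding="utf-8") as f:
            annotations[Path(name).stem] = json.load(f)
    return annotations
```

For instance, `load_annotations("TaskMeAnything/annotations")["taxonomy"]` would hold the parsed contents of taxonomy.json, assuming the repository is checked out in the current directory.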
attribute_category.json contains human-annotated classifications of all attributes in the Scene Graphs, grouping them into categories such as “color” and “size” so that more detailed Scene Graph questions can be generated.
cateid_to_concept.json, cateid_to_objects.json, and taxonomy.json together contain the human-annotated knowledge graph (taxonomy) and all annotations of angles, materials, colors, and shapes of the 3D assets for Task-Me-Anything.
- cateid_to_concept.json:
  - Collects all concepts from the Scene Graphs and 3D assets (e.g. apple, glass, dog) and normalizes them to their corresponding Wikidata pages.
  - For example, “eyeglass”, “eyeglasses”, “glasses”, and “spectacles” in the 3D assets, and “eye glasses” and “glasses” in the Scene Graphs, are all normalized to the QID Q37501, which corresponds to a concept page on Wikidata: https://www.wikidata.org/wiki/Q37501.

        "Q37501": {
            "surface_name": [ "eyeglasses", "glasses", "spectacles" ],
            "wikipedia": "Glasses",
            "wikidata": "glasses",
            "wikidata_description": "accessories that improve human vision",
            "objaverse": [ "eyeglass", "eyeglasses", "glasses", "spectacles" ],
            "scene_graph": [ "eye glasses", "glasses" ]
        },

  - surface_name is the normalized name we use to generate questions. For example, a concept named orange_(fruit) in the 3D assets is normalized to orange as its surface_name, for better readability in question generation.
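The normalization described for cateid_to_concept.json can be used in reverse to look up the QID for a raw name. The following is a rough sketch only: the data is the Q37501 excerpt shown above, and `build_name_index` is a hypothetical helper, not a function from the repository. It assumes every entry follows the same layout as that excerpt.

```python
# Excerpt copied from the Q37501 entry above; the real file holds many QIDs.
cateid_to_concept = {
    "Q37501": {
        "surface_name": ["eyeglasses", "glasses", "spectacles"],
        "wikipedia": "Glasses",
        "wikidata": "glasses",
        "wikidata_description": "accessories that improve human vision",
        "objaverse": ["eyeglass", "eyeglasses", "glasses", "spectacles"],
        "scene_graph": ["eye glasses", "glasses"],
    }
}


def build_name_index(concepts, source):
    """Map each raw name under `source` ("objaverse" or "scene_graph") to its QIDs.

    Hypothetical helper. Values are lists because this sketch does not
    assume that a raw name belongs to exactly one QID.
    """
    index = {}
    for qid, entry in concepts.items():
        for raw_name in entry.get(source, []):
            index.setdefault(raw_name, []).append(qid)
    return index


objaverse_index = build_name_index(cateid_to_concept, "objaverse")
scene_graph_index = build_name_index(cateid_to_concept, "scene_graph")

print(objaverse_index["eyeglass"])       # ['Q37501']
print(scene_graph_index["eye glasses"])  # ['Q37501']
```

Keeping one index per source keeps the 3D asset names and the Scene Graph names separate, since the same concept is spelled differently in each.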
- cateid_to_objects.json:
  - Contains the annotations of angles, materials, colors, and shapes for the 3D assets under each QID.
  - For example, QID Q37501 contains the 3D asset a709eff74e544fd6b9390bb2bae0f77e. images lists its visible perspectives in 2D sticker image scenarios, attributes contains the color, material, and shape of the 3D asset, and angles contains its visible angles in 3D scenarios.

        "a709eff74e544fd6b9390bb2bae0f77e": {
            "images": [ "000.png", "001.png", "002.png", "003.png", "004.png", "005.png", "006.png", "007.png", "008.png" ],
            "attributes": {
                "color": [ "blue" ],
                "material": [],
                "shape": []
            },
            "angles": [ 0, 120, 240 ]
        },
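A minimal sketch of how these asset annotations might be queried follows. It assumes the file maps each QID to a dictionary of asset IDs, which is inferred from the description above rather than confirmed, and `find_assets` is a hypothetical helper.

```python
# Assumed nesting: QID -> asset ID -> annotation record.
# The record itself is the excerpt shown above.
cateid_to_objects = {
    "Q37501": {
        "a709eff74e544fd6b9390bb2bae0f77e": {
            "images": [f"{i:03d}.png" for i in range(9)],  # 000.png ... 008.png
            "attributes": {"color": ["blue"], "material": [], "shape": []},
            "angles": [0, 120, 240],
        }
    }
}


def find_assets(objects, qid, attribute, value):
    """Return IDs of the assets under `qid` whose `attribute` list contains `value`."""
    return [
        asset_id
        for asset_id, record in objects.get(qid, {}).items()
        if value in record["attributes"].get(attribute, [])
    ]


print(find_assets(cateid_to_objects, "Q37501", "color", "blue"))
# ['a709eff74e544fd6b9390bb2bae0f77e']
print(find_assets(cateid_to_objects, "Q37501", "material", "metal"))
# []
```

Note that an empty attribute list, such as material in this excerpt, never matches any value under this sketch.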
- taxonomy.json:
  - Leverages the concept net in Wikidata to build a concept graph (taxonomy) for all concepts (QIDs) in Task-Me-Anything.
  - Includes information like “glasses (Q37501) is a subclass of optical instrument (Q1751850)”.
  - In taxonomy.json, the following entry means Q682582 is a subclass of Q11422:

        [ "Q11422", "Q682582" ],

  - The nodes listed below edges are all the concepts (QIDs) that are not in the Scene Graphs or 3D assets but help to build the taxonomy.
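The edge convention for taxonomy.json (parent first, subclass second) can be sketched as a small graph walk. This is an illustration under assumptions: the top-level key name `edges` is inferred from the wording above, the second edge is written from the glasses example rather than copied from the file, and both function names are hypothetical.

```python
from collections import defaultdict

# Assumed layout: "edges" is a list of [parent QID, child QID] pairs.
taxonomy = {
    "edges": [
        ["Q11422", "Q682582"],   # Q682582 is a subclass of Q11422
        ["Q1751850", "Q37501"],  # glasses is a subclass of optical instrument
    ],
}


def build_parent_map(edges):
    """Map each child QID to the set of its direct parent QIDs."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return parents


def ancestors(qid, parents):
    """Return every QID reachable from `qid` by following subclass-of edges upward."""
    seen = set()
    stack = [qid]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


parent_map = build_parent_map(taxonomy["edges"])
print(ancestors("Q37501", parent_map))  # {'Q1751850'}
```

The `seen` set keeps the walk from revisiting a concept, so it terminates even if a concept has several parents that share an ancestor.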