Media Transformations are tools we use to standardize the media you've uploaded into various Datasets. The main purpose of Media Transformations is to make your media easier to label, and thus to create better training data.
Best Practices for Transforming Media
Here are a few tips and tricks to consider when using Media Transformations.
When should I transform my media?
The scenarios differ from media type to media type, but in general, you want to use Media Transformations to make your media easier to label. The simpler you make your labeling tasks, the faster they can be completed and the better the quality will be.
Here are quick examples of how Media Transformations can help make media easier to label:
Tile large pictures into many smaller ones, so that each new "tile" can be labeled in parallel, greatly speeding up the labeling process.
Crop large pictures that include lots of extraneous pixels, reducing the total amount of imagery that needs to be reviewed and labeled.
Chunk lengthy videos into smaller clips that can be labeled in parallel, similar to how tiling works for pictures.
Where should I place the new "transformed" media?
When you perform a Media Transformation, you'll be asked to select a Target Dataset. We strongly recommend using a different Dataset than the one your original "untransformed" media are in.
Placing transformed media into a different Dataset prevents duplication within a particular Project, meaning the same pixels won't get labeled twice. When you attach a Dataset to a Project, 100% of the media in that Dataset become a part of that Project and get added to the Project's annotation queue. If you place your transformed media into the same Dataset as the original, then the original media and the transformed media will both be attached to the same Project!
For example, say you have a large satellite image over the Amazon Rainforest, and you'd like to train a model to look for deforestation. That one single satellite image might be 20,000 x 20,000 pixels in size, which is way too big for a single person to label all at once! It would be wise to tile the image, resulting in hundreds of smaller tiles. These tiles can now be labeled in parallel.
But if you were to place the tiled images into the same Dataset as the original 20,000 x 20,000 image, you'd essentially be labeling those same pixels twice: once on the original image, and then again on each of the individual tiles! All that extra work wouldn't yield any better results when training a model, so it would essentially be wasted annotation effort!
Now let’s take a deep dive into each individual transformation we offer on the platform!
Tile Transformation
Understanding the "tile" media transformation tool
In short: the tile media transformation tool lets you chop up an image into smaller-sized images. It's sort of like using a cookie cutter to cut the larger image into more manageable ones that are all the same size. (Note: tiling doesn't apply to videos—we have “chunk” for that.)
How does the tile tool work?
When you tile an image, you will specify a width (in pixels) that you want the smaller images to be. For instance, if you're tiling an image that's 5,000 x 5,000 pixels, and you select the new width as 1,000 pixels, you'll get 25 tiled images (a 5 x 5 grid) of 1,000 x 1,000 pixels each.
Here's an illustration that helps visualize this:
The important thing to note is that the tile tool makes sure that 100% of your original image is included, so nothing falls through gaps. If you have an original image with an odd shape, the tile tool will attempt to make as many "regular" tiled images as it can, and it will use black pixels to fill in around the edges as necessary.
Here's another example to help illustrate this:
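The tile arithmetic described above can also be sketched in a few lines of Python. This is only an illustration, not the platform's actual implementation: the function names are hypothetical, tiles are assumed to be square, and the black padding is assumed to be added along the right and bottom edges (the tool may place it differently).

```python
def tile_grid(image_width, image_height, tile_size):
    """Return (columns, rows, total_tiles) needed to cover the whole image.

    Assumption: tiles are square, tile_size pixels on each side.
    """
    if image_width <= 0 or image_height <= 0 or tile_size <= 0:
        raise ValueError("all dimensions must be positive")
    # Integer ceiling division, so partial tiles at the edges are counted.
    columns = (image_width + tile_size - 1) // tile_size
    rows = (image_height + tile_size - 1) // tile_size
    return columns, rows, columns * rows


def padding_needed(image_width, image_height, tile_size):
    """Return (pad_right, pad_bottom): black pixels needed to fill the grid.

    Assumption: padding goes on the right and bottom edges only.
    """
    columns, rows, _ = tile_grid(image_width, image_height, tile_size)
    pad_right = columns * tile_size - image_width
    pad_bottom = rows * tile_size - image_height
    return pad_right, pad_bottom


# The 5,000 x 5,000 example from the text, with 1,000-pixel tiles.
print(tile_grid(5000, 5000, 1000))
# An oddly shaped image that needs padding around the edges.
print(padding_needed(5500, 4200, 1000))
```

Running a quick calculation like this before tiling gives a rough idea of how many labeling tasks a single large image will turn into.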
Why and when should I tile an image?
Technically, you can tile any image you have in a Dataset, but that doesn't mean that you'll want to.
Like all of our media transformation tools, tiling is generally meant to help make your images easier to label. Since you or other users in your account will need to review the image and label the feature(s) you're looking for, you'll want to format that image to make those features as easy to find as possible. Cutting up a really large image into smaller, more manageable pieces will greatly reduce the amount of time you spend looking at any single image when labeling. This is because you won't have to pan and zoom over and over again to make sure you've inspected the entire large image.
Believe me, this really does make a big difference!
Therefore, the tile tool is only really useful when you have a really big image and what you're looking for is comparatively very small. For instance, if you're trying to find sedans in a satellite image that covers half of Houston, it would take you forever to pan and scroll and zoom through that entire image! For this reason, we've noticed the tile tool is most often used for geospatial imagery (e.g. satellite, aerial), but it isn't limited to that domain.
Crop Transformation
Understanding the "crop" media transformation tool
In short: the crop media transformation tool lets you crop media to a specific size or area of interest (AOI).
Note: for now, crop is only available for geo-referenced images. Support for other image types and video coming soon!
How does the crop tool work?
When you crop a geo-referenced image, you will specify a geo-referenced area of interest (AOI). The new image will be cropped to that specific AOI. Your original image is not changed, so you don't lose those extra pixels.
How to crop an image
1. Open the Dataset that contains the image(s) you want to crop.
2. Select the image(s) and click Transform, then select Crop.
Note: if you want to crop all images in a Dataset uniformly, you can simply select the entire Dataset from your Datasets table and click "Transform" to apply the crop tool. This isn't useful if your images are from different geographic areas but can be useful if you have images of the exact same area.
3. Specify a Dataset where you want your new cropped image(s) to end up. You can select the Dataset where the original image already lives, but we recommend selecting a different one so the same pixels don't end up being labeled twice in a Project.
4. Next, paste in an AOI in GeoJSON format. Most GIS tools support exporting a bounding box or polygon as GeoJSON, but we also recommend using geojson.io if you need to create your own manually.
Note: You only need to paste in the contents of the geometry section. Just make sure you get both the opening and closing curly braces { } !
Example:
{
  "type": "Polygon",
  "coordinates": [
    [
      [-122.45018005371094, 37.7761422535397],
      [-122.38014221191408, 37.7761422535397],
      [-122.38014221191408, 37.81358124698002],
      [-122.45018005371094, 37.81358124698002],
      [-122.45018005371094, 37.7761422535397]
    ]
  ]
}
5. Click Crop. Your new, cropped image should appear in your target Dataset shortly!
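If you'd rather generate the AOI from step 4 than draw it by hand, here is a minimal Python sketch that uses only the standard library. It is an illustration under a few assumptions: the helper names are hypothetical, the bounding box does not cross the antimeridian, and the file exported from your GIS tool is either a bare geometry, a Feature, or a FeatureCollection holding exactly one feature.

```python
import json


def bbox_to_polygon(west, south, east, north):
    """Build a GeoJSON Polygon geometry from a lon/lat bounding box."""
    if west >= east or south >= north:
        raise ValueError("expected west < east and south < north")
    # GeoJSON positions are [longitude, latitude]. The ring is listed
    # counterclockwise and repeats the first position to close it.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def extract_geometry(geojson_obj):
    """Return only the geometry section of a parsed GeoJSON object."""
    kind = geojson_obj.get("type")
    if kind == "FeatureCollection":
        features = geojson_obj.get("features", [])
        if len(features) != 1:
            raise ValueError("expected exactly one feature to use as the AOI")
        return features[0]["geometry"]
    if kind == "Feature":
        return geojson_obj["geometry"]
    # Anything else is assumed to already be a bare geometry.
    return geojson_obj


# Same bounding box as the example AOI above.
aoi = bbox_to_polygon(
    -122.45018005371094, 37.7761422535397,
    -122.38014221191408, 37.81358124698002,
)
print(json.dumps(aoi, indent=2))  # text to paste into the AOI field
```

The `extract_geometry` helper is handy when your export wraps the shape in a Feature or FeatureCollection, since only the geometry section needs to be pasted in.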
Why and when should I crop an image?
Technically, you can crop any geo-referenced image you have in a Dataset, but that doesn't mean that you'll want to.
Like all of our media transformation tools, cropping is generally meant to help make your images easier to annotate. Since you or other users in your account will need to review the image and label the feature(s) you're looking for, you'll want to format that image to make those features as easy to find as possible.
Cropping an image with unnecessary pixels down to a smaller AOI can really make a difference when it comes to labeling time. Cropping works best when you are interested in a particular area for your use case, say an airport, but your imagery extends beyond that area. Without a crop, your labeling team will waste valuable time looking at the forest surrounding the airport when you are really just interested in the objects on the tarmac. The same goes for any geospatial use case: if you want to look into wildlife in Glacier National Park, you will want to crop all your imagery to the exact AOI of Glacier NP so your team isn't looking at other land and finding wildlife there!
That said, cropping is related to our tile tool. Tiling is more useful when you do still want to review and label all parts of a large image, whereas cropping is better for when you want to "rule out" entire areas of an image you know you don't need to review.
Chunk Transformation
Understanding the "chunk" media transformation tool
In short: the chunk media transformation tool lets you chop up a video into shorter video clips. Think of it like the editing room at a Hollywood studio: you're taking the full length of the video and cutting it into shorter, more manageable "chunks". (Note: chunking doesn't apply to images.)
How does the chunk tool work?
When you chunk a video, you will specify a length (in seconds) that you want the shorter videos to be. For instance, if you're chunking a video that's 100 seconds long, and you select the new length as 10 seconds, you'll get 10 chunked videos of 10 seconds each.
Keep in mind: Videos are created with different frame rates, meaning the number of frames contained in each second (e.g. 120 frames per second, or 120 fps). Using the example above, if you've chunked your 100 second video into 10 chunks of 10 seconds each, and the video was recorded at 120 fps, each of those chunked videos will contain 1,200 frames (120 frames per second x 10 seconds).
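To make the numbers concrete, here is a small Python sketch of the same arithmetic. The function name is hypothetical, and the handling of leftover footage is an assumption: when the video length isn't an exact multiple of the clip length, the sketch counts one shorter final clip, which may not match what the tool actually does.

```python
import math


def chunk_plan(video_seconds, chunk_seconds, fps):
    """Return (chunk_count, frames_per_full_chunk) for a chunked video.

    Assumption: any leftover footage becomes one shorter final chunk.
    """
    if video_seconds <= 0 or chunk_seconds <= 0 or fps <= 0:
        raise ValueError("all inputs must be positive")
    chunk_count = math.ceil(video_seconds / chunk_seconds)
    frames_per_full_chunk = round(chunk_seconds * fps)
    return chunk_count, frames_per_full_chunk


# The example from the text: 100 seconds, 10 second chunks, 120 fps.
count, frames = chunk_plan(100, 10, 120)
print(f"{count} chunks of about {frames} frames each")
```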
How to chunk a video
1. Open the Dataset that contains the video(s) you want to chunk.
2. Select the video(s) and click Transform, then select Chunk.
Note: if you want to chunk all videos in a Dataset uniformly, you can simply select the entire Dataset from your Datasets table and click "Transform" to apply the chunk tool.
3. Specify a Dataset where you want your new chunked videos to end up. We highly recommend selecting a different Dataset.
4. Next, choose a clip length in seconds. Remember: if your video has a very high frame rate (in fps), even a 10 second video could still contain a lot of frames! A good rule of thumb for video annotation is to try to keep videos at or under 100 frames each, when possible.
5. Click Chunk, then sit back and relax—your new, chunked videos will start appearing in your target Dataset shortly!
Why and when should I chunk a video?
Technically, you can chunk any video you have in a Dataset, but that doesn't mean that you'll want to.
Like all of our media transformation tools, chunking is generally meant to help make your videos easier to label. Since you or other users in your account will need to review the video and label the feature(s) you're looking for, you'll want to format that video to make those features as easy to find as possible.
Chunking a really long video into smaller, more manageable clips will greatly reduce the number of frames you have to consider at once when labeling. This is because you won't have to scrub back and forth over and over again in the video timeline to make sure you've inspected the entire video. It also means multiple people can work on different segments of the same video in parallel, greatly speeding up your overall annotation time.
Again, believe me, this really does make a big difference!
Therefore, the chunk tool is especially useful when you have a long video that has a high frame rate. When considering whether you should chunk a video, use this simple calculation to help:
Length of original video (in seconds) x frame rate (in frames per second) = total frames
Now ask yourself: Would I be able to review that many frames and label the objects I'm looking for all in one sitting, or would that be exhausting? If the answer is "that sounds exhausting!" then you should probably chunk your video!
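The calculation above is easy to script if you have many videos to check. The sketch below is only a rough aid: the function names are hypothetical, the default budget of 100 frames simply reuses the rule of thumb from the chunking steps, and the suggested clip length can come out fractional, which the chunk tool may or may not accept.

```python
def total_frames(video_seconds, fps):
    """Length of original video (seconds) x frame rate (fps)."""
    return round(video_seconds * fps)


def needs_chunking(video_seconds, fps, frame_budget=100):
    """True when the video holds more frames than the chosen budget."""
    return total_frames(video_seconds, fps) > frame_budget


def max_clip_seconds(fps, frame_budget=100):
    """Longest clip length (in seconds) that stays within the frame budget."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_budget / fps


# A 100 second video at 120 fps.
print(total_frames(100, 120))
print(needs_chunking(100, 120))
# Longest clip that stays at or under 100 frames at 25 fps.
print(max_clip_seconds(25))
```

Pick whatever frame budget matches how much your labelers can comfortably review in one sitting; 100 is just a starting point.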