Dataset Curation
Author: John (Jack) Messerly
I’m aware of three ways to create an image enhancement dataset.
- Get a set of “normal” images and distort their colors or lighting. The network is then trained to “undo” these distortions in a supervised manner.
- Manually retouch every “normal” image and teach a network to “enhance” normal images in a supervised way. This is usually impractical, given how hard it is to manually retouch an image (we’ll do an example of this further down this post)
- Get a set of “normal” images and another set of “wonderful” images that feature different subjects. Unpaired loss functions are used to teach the network how to map “normal” images into “wonderful” images, even though the pictures are all completely different scenes. This is done in tone mapping and HDR.
For fixing bad lighting and color problems, the standard way to create a dataset is to go with the first option. If we’re just trying to fix “messed up” images, this approach isn’t limiting. ML engineers often use random scalar multiplication or nonlinear filters, such as gamma, to create color and exposure balance datasets. Torchvision, PyTorch’s vision library, offers a transform called “ColorJitter” that accomplishes this task and is commonly used for dataset augmentation in tasks like semantic segmentation. While randomizing image channels is an effective way to add “underexposed” and “desaturated” images, it has its limitations when it comes to simulating color distortion. Here’s an example of how this methodology can be used to create a color and exposure balance dataset.
Distorting a Dataset (Naively)
Basic Imports
import torch
from torchvision.transforms import CenterCrop, Resize, ColorJitter
import torchvision.transforms.functional as TF
import numpy as np
import requests
from PIL import Image
import matplotlib.pyplot as plt
import cv2
url = 'https://images.pexels.com/photos/276299/pexels-photo-276299.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1'
image = Image.open(requests.get(url, stream=True).raw)
Original Image
IMAGE_DIM=300
width,height = image.size
dim = min(width,height)
transforms = torch.nn.Sequential(
CenterCrop((dim,dim)),
Resize((IMAGE_DIM,IMAGE_DIM)),
)
original = transforms(image)
original
Distorted with Scalars
To create a distortion dataset, a common method is to take “regular” images and manipulate their red, green, and blue channels. This allows for the training of machine learning algorithms to fix these distortions. One approach is to multiply the channels and then clip/normalize the values back into the 0 to 255 range. Alternatively, a gamma can be used to avoid exceeding the 255 limit. It’s worth noting that gray world assumption algorithms are effective in removing this type of distortion.
data = np.array(original).astype(float)
# some scalars to warm up the color temperature
red = 1.2
blue = 0.8
data[:,:,0] *= red
data[:,:,2] *= blue
data = np.clip(data,0,255)
data = Image.fromarray(data.astype(np.uint8))
# a classic recipe for warming up the lighting in an image
data
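To make the gray world remark concrete, here is a minimal sketch (my own illustration, not part of the project code). It assumes float RGB values, uses a random array as a stand-in for a photo, and keeps the values low enough that the scalars never clip. The `gray_world` helper is a hypothetical name for the textbook version of the algorithm: scale each channel so that its mean matches the mean over all channels.

```python
import numpy as np

def gray_world(img):
    """Scale each channel so its mean matches the mean over all channels."""
    channel_means = img.reshape(-1, 3).mean(axis=0)
    return img * (channel_means.mean() / channel_means)

# Stand-in for a photo: random RGB values in [0.05, 0.8], so the scalars
# below never push a value past 1.0 (no clipping involved).
rng = np.random.default_rng(0)
clean = rng.uniform(0.05, 0.8, size=(32, 32, 3))

# The same warm-up recipe as above: boost red, cut blue.
warm = clean * np.array([1.2, 1.0, 0.8])

restored = gray_world(warm)
reference = gray_world(clean)

# Up to one global brightness factor, both results are identical,
# which means the color cast is gone.
ratio = restored / reference
print(ratio.min(), ratio.max())
```

Once clipping at 255 enters the picture, the recovery is only approximate, because the clipped highlights no longer scale with the rest of the channel.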
Distorted with Gamma
To adjust the color temperature nonlinearly, you can perform “gamma correction” on your different channels. This already succeeds in creating distortions that cannot be removed by a “gray world algorithm”.
data = np.array(original).astype(float)
data /= 255
# Distort the colors nonlinearly so that values don't go out of bounds
red = 1 / 1.5
blue = 1 / 0.8
data[:,:,0] = data[:,:,0] ** red
data[:,:,2] = data[:,:,2] ** blue
data *= 255
data = Image.fromarray(data.astype(np.uint8))
# Warmer, but in a weird non-linear way: the grass is more red than the sky
data
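As a rough check of the claim that gray world cannot undo this, the sketch below applies the same exponents to a random stand-in image and then runs a textbook gray world correction (rescale every channel to a common mean). The helper name and the random data are my own assumptions, not project code.

```python
import numpy as np

def gray_world(img):
    """Scale each channel so its mean matches the mean over all channels."""
    channel_means = img.reshape(-1, 3).mean(axis=0)
    return img * (channel_means.mean() / channel_means)

rng = np.random.default_rng(0)
clean = rng.uniform(0.05, 0.8, size=(32, 32, 3))

# Same exponents as the gamma example above (red, green, blue).
gamma = np.array([1 / 1.5, 1.0, 1 / 0.8])
distorted = clean ** gamma

# If gray world had removed the cast, this ratio would be one constant.
ratio = gray_world(distorted) / gray_world(clean)
red_spread = ratio[..., 0].max() - ratio[..., 0].min()
green_spread = ratio[..., 1].max() - ratio[..., 1].min()
print(red_spread, green_spread)
```

The green channel (exponent 1) comes back clean, but the red ratio changes with pixel intensity, so no single per-channel scalar can repair it.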
Distorting Saturation
# Convert to Hue, Saturation and Value Space
data = np.array(original)
data = cv2.cvtColor(data,cv2.COLOR_RGB2HSV).astype(float)
data = data / 255
desaturation = 0.5
data[:,:,1] *= desaturation
data = data * 255
data = np.clip(data,0,255).astype(np.uint8)
data = cv2.cvtColor(data,cv2.COLOR_HSV2RGB)
data = Image.fromarray(data)
# Observe a washed out / uncolorful version of our original meadow
data
Distorting the Hue
You can mess with the hue of an image this way as well, but I only suggest doing this for dataset augmentation purposes (if you’re training an image classifier or something unrelated). If you augmented an image enhancement dataset this way, you might as well be learning how to “colorize” black and white images. Your network would overfit and not be generally useful.
# Convert to Hue, Saturation and Value Space
data = np.array(original)
data = cv2.cvtColor(data,cv2.COLOR_RGB2HSV).astype(float)
data = data / 255
hue = 0.5
data[:,:,0] *= hue
data = data * 255
data = np.clip(data,0,255).astype(np.uint8)
data = cv2.cvtColor(data,cv2.COLOR_HSV2RGB)
data = Image.fromarray(data)
# What have we done?
data
Limitations of Perturbing RGB/HSV Channels Randomly
- Too Easy or Too Hard: When adjusting channels with multiplicative scalars or gamma, you tend to create datasets with either simple distortions, or extreme ones. Being able to remove a warm or cool tone isn’t particularly impressive, even if your network learns to solve this problem well. But if you include images with drastic changes to the hue, or apply highly nonlinear filters to the channels independently, you can hinder the network from generalizing well on anything you’d encounter in real life.
- Data Inefficient: When creating a dataset for this purpose, it’s unnecessary to evenly simulate all potential color imbalances. Ideally, you want to focus on lighting distortions that you might actually find in the real world. Randomizing your color distortions will build a dataset to fix lighting problems that are rare or impossible.
A better approach than randomizing is to handcraft a distortion dataset populated with images that are both:
- Complex
- Likely to be found in the real world, or in edited photos online
We will now turn our conversation to Look Up Tables (LUTs).
Introducing Look Up Tables (LUTs)
The term “LUT” refers to “Look Up Table,” which is essentially a set of discrete functions that operate like hash tables. When you apply a LUT to an image, you scan through each pixel in the image, locate its corresponding coordinate in the LUT, and modify the color of that pixel according to the information provided in the LUT.
A 3D LUT is a mapping that assigns a new color to every possible pixel color. In the example provided, winter tones have been mapped to pastels for demonstration purposes. However, a LUT with an entry for every possible 8-bit color would need an excessive number of elements (256x256x256 = 16,777,216). Therefore, LUTs are typically created with about 33x33x33 uniformly spaced bins; some values are filled in manually, and the remaining ones are set using techniques like trilinear interpolation.
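Here is a minimal NumPy sketch of how a sparse 3D LUT can be applied with trilinear interpolation. It assumes float RGB values in [0, 1] and a table of shape (size, size, size, 3) indexed as [r, g, b]. The function names are mine, and real tools may store or interpolate their tables differently.

```python
import numpy as np

def identity_lut(size=33):
    """LUT of shape (size, size, size, 3) that maps every lattice color to itself."""
    axis = np.linspace(0.0, 1.0, size)
    r, g, b = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack([r, g, b], axis=-1)

def apply_lut(img, lut):
    """Apply a 3D LUT to float RGB pixels in [0, 1] with trilinear interpolation."""
    size = lut.shape[0]
    pos = np.clip(img, 0.0, 1.0) * (size - 1)             # position on the lattice
    lo = np.minimum(np.floor(pos).astype(int), size - 2)  # lower corner of the cell
    frac = pos - lo                                       # offset inside the cell
    out = np.zeros(img.shape, dtype=float)
    # Blend the 8 corners of the surrounding cell, weighted by proximity.
    for dr in (0, 1):
        wr = frac[..., 0] if dr else 1.0 - frac[..., 0]
        for dg in (0, 1):
            wg = frac[..., 1] if dg else 1.0 - frac[..., 1]
            for db in (0, 1):
                wb = frac[..., 2] if db else 1.0 - frac[..., 2]
                corner = lut[lo[..., 0] + dr, lo[..., 1] + dg, lo[..., 2] + db]
                out += (wr * wg * wb)[..., None] * corner
    return out

lut = identity_lut(33)
print(lut.shape[0] ** 3)  # 35937 lattice entries

rng = np.random.default_rng(1)
pixels = rng.random((5, 7, 3))
unchanged = apply_lut(pixels, lut)  # the identity LUT returns the input colors
```

Editing the entries of `lut` (for example, lowering the blue output of bright colors) changes the look of every image passed through `apply_lut`, which is all a creative LUT does.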
In the world of cinema, 3D LUTs are used during post-processing to give each movie a unique atmosphere. For instance, horror movies often use LUTs that map grey values to blue or green, darken shadows, and desaturate faces. Meanwhile, Wes Anderson films tend to map grey values to yellow, lighten the overall appearance, increase the saturation of oranges, and map blue to teal (although there is an ongoing community effort to design the perfect Wes Anderson LUT filters). In my opinion, the LUT below creates an image that would fit in a true crime series or cop drama.
1D LUTs for Shadows, Midtones and Highlights
A 1D LUT impacts one channel independently and is easier to create and manage than 3D LUTs. To illustrate, I have included an image of editing a lookup table using the free software GIMP. By accessing the Colors tab and selecting the Curves menu, I adjusted the middle of the “red” channel, which created a “curve” that maps old intensity (x-axis) to new intensity (y-axis). The outcome of this operation is that all “middle” intensities of red are now mapped to a higher intensity. This is an example of a 1D LUT since it exclusively affects a single channel.
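In code, a curve like this boils down to a 256-entry table per channel. The sketch below is a simplified stand-in for what the Curves dialog produces: the control points are hypothetical, and I interpolate linearly between them, whereas GIMP draws a smooth curve.

```python
import numpy as np

def curve_to_table(points):
    """Turn (input, output) control points into a 256-entry lookup table."""
    xs, ys = zip(*sorted(points))
    table = np.interp(np.arange(256), xs, ys)
    return np.clip(np.rint(table), 0, 255).astype(np.uint8)

# Hypothetical control points: endpoints fixed, middle of the curve pulled up.
red_table = curve_to_table([(0, 0), (128, 170), (255, 255)])

# Stand-in for an 8-bit RGB image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

edited = image.copy()
edited[..., 0] = red_table[image[..., 0]]  # look up a new value for every red pixel
```

Green and blue are untouched, which is exactly what makes this a 1D LUT.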
Curves that capture the effects you want
1D LUTs can add a lot to your dataset. But as you can see, increasing the red channel alone did not create a convincing “warm” feel to my image. In fact, just applying a midtone filter to one channel is basically equivalent to gamma adjusting that channel. When building a LUT dataset, you have to get a little creative. Something that works a bit better is to subtract out blue when you want to create a warm tone or subtract out red when you want to create a cool tone. In the below image, I’ve reduced the values of the mid to high-level blues, to create a warmer look. I’ve left the darker parts of the blue channel alone, as these are unrelated to “lighting” effects. To create a convincing “neon” look, I would advise subtracting out green and then adding in either red or blue. I’ve modified all the tones slightly because artificial neon lights tend to change the value of all the pixels in a scene (shadows, midtones, and highlights).
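The “subtract blue, leave the shadows alone” recipe could be sketched like this. The control points are made up for illustration and are not the values from my GIMP curves.

```python
import numpy as np

levels = np.arange(256)

# Hypothetical blue curve: identity up to level 64 (shadows untouched),
# then progressively less blue in the midtones and highlights.
blue_table = np.rint(
    np.interp(levels, [0, 64, 160, 255], [0, 64, 130, 200])
).astype(np.uint8)

def warm_up(image):
    """Apply the blue curve to an 8-bit RGB image; red and green stay as they are."""
    out = image.copy()
    out[..., 2] = blue_table[image[..., 2]]
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
warmer = warm_up(image)
```

A cool tone would be the mirror image: keep blue as it is and apply a similar curve to the red channel.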
Contrast
Creating 1D LUTs that adjust contrast is a simple process. When building a dataset, it’s recommended to do so to prevent the network from removing all color contrast in the image. It’s important to note that I have chosen the “value” channel, which affects the lightness and exposure level.
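As a rough illustration, a contrast curve on the “value” channel can be written without any color space conversion. The sketch assumes float RGB in [0, 1] and the HSV definition of value (the largest of R, G, and B). Scaling all three channels of a pixel by the same factor changes its value while leaving its hue and saturation alone. The S-curve itself is a hypothetical choice (a blend between the identity and a smoothstep), not the curve I drew in GIMP.

```python
import numpy as np

def s_curve(v, strength=0.5):
    """Push values away from mid-gray; strength=0 is the identity."""
    smooth = v * v * (3.0 - 2.0 * v)  # smoothstep on [0, 1]
    return (1.0 - strength) * v + strength * smooth

def adjust_value(rgb, curve):
    """Apply a curve to HSV value (max of R, G, B), keeping hue and saturation."""
    v = rgb.max(axis=-1, keepdims=True)
    # Black pixels (v == 0) keep a scale of 1 to avoid dividing by zero.
    scale = np.divide(curve(v), v, out=np.ones_like(v), where=v > 0)
    return rgb * scale

rng = np.random.default_rng(2)
image = rng.uniform(0.05, 1.0, size=(6, 6, 3))
contrasty = adjust_value(image, s_curve)
```

Passing a negative `strength` (for example `lambda v: s_curve(v, -0.5)`) flattens the curve instead, which gives you the low-contrast half of the dataset.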
Saving our 1D LUTs as .cube files and applying them in python
To apply a color mapping to any image, you can convert it into a .cube file using a Hald CLUT file. These files, available for download at https://sirserch.github.io/lut-creator-js/, contain every possible color at some resolution. You can use software like Python, GIMP, or Photoshop to alter the colors in this file, and then simple scripts can turn them into .cube files. You can find step-by-step instructions for this process on a webpage. It’s not necessary to download any additional software, but searching for “how to create LUTs” may result in spammy ads.
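For the curious, here is a sketch of what such a conversion script might look like. It is not the script from the site above. It assumes the usual Hald layout (a square image whose pixels, read row by row, step through red fastest, then green, then blue) and the plain-text .cube layout (a `LUT_3D_SIZE` header followed by one “r g b” line per entry, red changing fastest). An identity Hald array is generated in code so the example runs without any files.

```python
import numpy as np

def identity_hald(level=4):
    """Identity Hald CLUT as a float array of shape (level**3, level**3, 3)."""
    size = level ** 2  # steps per channel
    axis = np.linspace(0.0, 1.0, size)
    # With indexing="ij" the first grid varies slowest, so red ends up fastest.
    b, g, r = np.meshgrid(axis, axis, axis, indexing="ij")
    colors = np.stack([r, g, b], axis=-1).reshape(-1, 3)
    return colors.reshape(level ** 3, level ** 3, 3)

def hald_to_cube(hald, title="edited"):
    """Serialize a (possibly edited) Hald array with values in [0, 1] as .cube text."""
    side = hald.shape[0]
    size = round(side ** (2 / 3))  # steps per channel
    if size ** 3 != side * side:
        raise ValueError("not a square Hald CLUT image")
    lines = [f'TITLE "{title}"', f"LUT_3D_SIZE {size}"]
    lines += [f"{r:.6f} {g:.6f} {b:.6f}" for r, g, b in hald.reshape(-1, 3)]
    return "\n".join(lines) + "\n"

hald = identity_hald(level=4)  # 64x64 image, 16 steps per channel
hald[..., 2] *= 0.8            # stand-in for an edit made in GIMP: less blue
cube_text = hald_to_cube(hald, title="warm sketch")
print(cube_text.splitlines()[:4])
```

With a real Hald PNG that you edited in GIMP, you would load the pixels, divide by 255, and pass that array to `hald_to_cube` instead of the generated one.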
1D LUTs to Include in your Distortion Dataset
To achieve the desired effects, I suggest creating a small set of random color curves with no more than 20 entries using the method mentioned earlier. Aside from adjusting the midtone gamma, keep in mind that green highlights and blue shadows are common, especially under fluorescent lighting. It’s important to make sure your 1D curves are diverse enough to cover a wide range of colors, but I advise against generating them randomly as this may result in odd images that cause your network to overfit. For each image, only a few LUT applications selected at random from your larger set are needed to create a sufficient dataset for removing 1D LUT distortions. On the Github page for this project, I’ve included some 1D LUTs I made using GIMP that should be helpful. Here are some examples of the LUTs I created for you:
- orange
- extreme red
- slight red
- warm
- yellow
- cold
- extreme blue
- blue teal
- storybook (green / yellow lighting mix)
- fluorescent (green highlights)
- green midtones
- green teal
- purple majesty (+ blue - green)
- neon (+ red - green)
- just purple (+ red + blue)
- desaturated
- technicolor
- dark shadows
- over exposed
- saturated highlights
- underexposed
# To apply our lut file, we can use the pillow-lut library
# found at https://pypi.org/project/pillow-lut/
#
# A word of caution here: pillow isn't compatible with older versions
# of PIL, which some of the functions we need in HuggingFace's diffusers
# library rely on. I would suggest making a virtual environment with
# the following shell commands:
#
# python3 -m venv lut_env/
# cd lut_env/
# source bin/activate
#
# and perform all of your lutting here.
# In order to apply a LUT file to an image, we can use the
# following code
from PIL import Image
from pillow_lut import load_cube_file
# convert to RGB so palette or grayscale files can be filtered too
image = Image.open('my_boring_image.png').convert('RGB')
lut_file = 'cool_lut_I_made_in_GIMP.cube'
lut = load_cube_file(lut_file)
better_or_worse_image = image.filter(lut)
# "better_or_worse_image" now has whatever changes are dictated in your
# LUT file, which reflects how you changed your Hald CLUT file in GIMP
My Homemade 1D LUTs (and some contrast/exposure adjusters)
I created these files using the 64 resolution Hald format. The folders contain an equal number of distorters, making it a good mix. To keep things balanced, I suggest randomly selecting them when creating a dataset. Click on this link to access the folder
Limitations of 1D LUTs
Creating “brown” or vintage VSCO/Instagram filters using 1D curves can be quite challenging. Additionally, it is not possible to adjust saturation levels through 1D LUTs, since saturation and hue depend on all three color channels at once. Adjusting saturation programmatically, however, is straightforward. Decreasing saturation results in a less colorful image, while increasing it makes the image more vivid. Changing the hue, on the other hand, is more difficult since it involves altering the fundamental colors of the image. This process requires a lot of skill and precision. Some people sell high-quality “color mappings” online for over $20 because creating good color maps is a challenging task. These color maps, known as 3D LUTs, must accurately map all possible pixels to different ones without making the image look worse.
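A small sketch shows why saturation cannot be a per-channel curve. It uses a luma-based saturation adjustment (blend each pixel with its own gray level, Rec. 601 weights), which is an assumption on my part and slightly different from the HSV version used earlier in this post. Two pixels with the same red value end up with different red outputs because their green values differ.

```python
import numpy as np

LUMA = np.array([0.299, 0.587, 0.114])  # Rec. 601 luma weights

def set_saturation(rgb, amount):
    """amount=0 gives grayscale, 1 leaves the colors alone, >1 is more vivid."""
    gray = (rgb @ LUMA)[..., None]
    return np.clip(gray + amount * (rgb - gray), 0.0, 1.0)

# Two pixels with identical red (0.5) but different green.
pixels = np.array([[0.5, 0.2, 0.2],
                   [0.5, 0.8, 0.2]])
washed_out = set_saturation(pixels, 0.5)

# The red outputs differ, so no curve that only sees the red input
# could have produced both of them.
print(washed_out[:, 0])
```

A 1D LUT only ever sees one channel at a time, which is why this kind of edit needs either code like the above or a full 3D LUT.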
Playing with 3D LUTs in GIMP and G’MIC
Firstly, I suggest avoiding the creation of a complex 3D LUT in GIMP, particularly if you want to use it for real photo retouching. Instead, I recommend searching for websites that offer free LUTs and using them to alter your images. As a helpful resource, I have gathered some free LUTs for you, which can be found on this page’s GitHub. However, since GIMP is free software (it is released under the GNU General Public License), I will demonstrate how to create a 3D LUT inexpensively using it.
3D LUTs the Simple Way (HaldCLUT file)
Remember that a 3D LUT is a table that links RGB coordinates to new ones. Although creating one that works for all images requires skill, you can easily make one by editing a HaldCLUT file using various sliders, not just the 1D curves. GIMP’s color and filter tabs contain many sliders, but the Hue-Saturation panel in the Colors tab is a versatile option.
Even with this simple slider, you can create some interesting cinematic effects. Consider the image of a model below. On the top is an original image, and on the bottom is its counterpart with an “Orange and Teal” look mapped to it.
Disclosure: I actually found this image in its edited state, and used the Hue-Saturation tool to “recover” what I believe is the original image. I believe it had its blues mapped to cyan, its reds mapped to orange, and the shadows darkened. This is basically a common edit. Using the Hue-Saturation tool and 1D value curves, I think I did a pretty good job of extracting the “before”. You can tell that the “after” image is edited because of the harsh transition on the “Genstar” banner: there is an unnatural blue highlight that divides it from the rest of the image. Sometimes, applying LUTs to enhance images will result in artifacts like these. For what it’s worth, I think my fix looks nicer than the heavily post-processed one.
Editing 3D Cubes with G’MIC
GIMP offers some enjoyable sliders to experiment with, but when you feel like you’ve exhausted the Hue-Saturation and Curves tools, it’s time to step up to G’MIC. G’MIC is a filter pack that has been developed by numerous individuals, and its standout feature is the Customize CLUT tool. To access this tool:
- Download G’MIC here. The plug-in isn’t its own application: once installed, it is used from inside GIMP
- Open GIMP, head over to the Filters menu and select G’MIC-Qt
- In the G’MIC menu, go to the Colors tab and scroll down to Customize CLUT
Tips for editing a full 3D LUT
In the Customize CLUT menu, you’ll find buttons that you’re likely already familiar with for adjusting brightness, contrast, and more. However, the true strength of this tool lies in its “Action” items. Each action includes a source color, a target color, and one of three available options:
- Ignore (you can turn rules on and off quickly this way)
- Lock Source: you can pick colors in your image and lock them, so the LUT learns to not change them. Right away, you should pick some white values in your image (snow, etc.) and lock them so you don’t get a global color imbalance.
- Replace Source by Target: This is your main weapon. In the above example, I selected the red on the jacket and replaced it with orange, and replaced the watercolor with a more icy teal.
I am impressed with the tool’s ability to regress a LUT from given constraints, although I am uncertain about the algorithm used. I suspect that trilinear interpolation, as described in Hui Zeng et al.’s work on differentiably learning LUTs, may be involved. You can find more information on this topic at https://github.com/HuiZeng/Image-Adaptive-3DLUT.
Finding LUTs online
Retouching is tedious, and making good LUTs is hard. These examples have been simple, but if we took any of these 3D LUTs we made and applied them to any image, it would probably be terrible. Real artists spend many hours designing 3D LUTs to work well with a wide range of imagery. Fortunately, there are websites with lots of free LUTs that create a wealth of cinematic effects. I recommend downloading free “.cube” files at freshluts.com. You’ll have to sign up for a free account, but I’ve used them for a while and I don’t find them sketchy. For dataset generation, I recommend searching them by tint (red, blue, green, magenta, brown), and just downloading a whole bunch of them.
Some Handpicked 3D LUTs
I’ve added a section of artistic LUTs to match the damaging 1D LUTs discussed above. Admittedly, these come from online sources. I’m not confident enough in my own 3D LUTs to share them with the general public for dataset generation, as I don’t know how well they generalize over hundreds or thousands of images. Each folder (that is labeled with a color) contains 3 LUTs: one that greatly changes the vibe of the image, one that slightly retouches it, and a medium one. The other folders have experimental LUTs, cinematic effect LUTs, and VSCO/Instagram-like filters that I also used to spice up my own dataset. A link to this folder is here
Where to find your images
I recommend anywhere between 5,000 and 15,000 total images (all distortions included), although you can get very good results with just 3,000 distorted images created from 500 “true” images. I used images from:
- The AVA Aesthetics Dataset: A public dataset of images curated from DP Challenge. The dataset is here, and a paper describing it is here
- MS COCO: Tried and true, COCO 2017 was also used to train the autoencoder used in this project. The dataset is here. Note that as of 2023, the only reliable way to access this dataset is with wget
- PPR10K Portrait Dataset: The only face dataset that I like for this problem, and also a project worked on by the same people who authored the seminal “Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time”. Link to the portrait dataset
Basic Dataset Curation
To distort your images, I suggest organizing the LUTs into categories such as red, blue, and contrast. For each original picture, apply a distortion from each category by randomly selecting a LUT flavor. Make sure to balance the red, blue, and green distortions while providing plenty of variation.
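One way to organize this is to plan the distortions before touching any pixels. The sketch below only builds the list of (image, category, LUT) assignments. The category names and .cube file names are placeholders, not the actual contents of my folders, and applying each LUT is left to whichever LUT library you use.

```python
import random

# Hypothetical layout: category name -> LUT files of that flavor.
LUTS_BY_CATEGORY = {
    "red": ["orange.cube", "slight_red.cube", "warm.cube"],
    "blue": ["cold.cube", "blue_teal.cube", "extreme_blue.cube"],
    "green": ["green_midtones.cube", "green_teal.cube"],
    "contrast": ["dark_shadows.cube", "over_exposed.cube", "underexposed.cube"],
}

def plan_distortions(image_names, luts_by_category, seed=0):
    """Pick one random LUT from every category for every original image."""
    rng = random.Random(seed)  # seeded so the dataset can be rebuilt
    plan = []
    for name in image_names:
        for category, luts in luts_by_category.items():
            plan.append((name, category, rng.choice(luts)))
    return plan

originals = [f"img_{i:04d}.jpg" for i in range(5)]  # placeholder file names
plan = plan_distortions(originals, LUTS_BY_CATEGORY)
print(len(plan))  # 5 images x 4 categories = 20 distorted images
```

Because every image receives exactly one LUT from each category, the red, blue, and green distortions stay balanced no matter how many LUT files each folder holds.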
Below are some of the images from the time I spent living in Vermont (although on second glance, I think all those images are technically in New Hampshire). They are distorted with a splatter of LUTs. Notice that some are obviously warm or cool, while others are more complex color mappings. Absent, however, are random colors that wouldn’t normally appear in a modern digital photograph, the kind you would get by just applying random color transformations to images in a Python loop.
General Enhancement LUTs (for skin tones)
There are LUTs available that can generally enhance any image. You can purchase well-engineered LUT packs from places like Colorist Factory for around $80. These LUT sets have a “grade” for every possible image you could use them on. Colorist Factory’s blog is a great resource to learn about professional color design, but be aware that it may be more complex than this amateur post.
On the free website mentioned above, you can find a free LUT called Severn that enhances images while giving them a cooler tone. Another LUT called Nikon Z6, on the other hand, warms up the images. It’s worth noting that even though Nikon Z6 is better at warming up faces, Severn still manages to warm up the model’s face while maintaining a cool overall tone. By using a combination of these LUTs, I was able to improve the faces dataset that I used to train my network. This helped prevent skin tones from appearing washed out after balancing the other colors in the image.
Please note that I neutralized the colors of this image before applying these enhancements. This was done to achieve a more natural look for both Nikon and Severn. If your image already has high contrast or unusual lighting effects, using an enhancement LUT may result in an even stranger appearance. Therefore, it’s important to review your images in the distorted dataset prior to including them. While they don’t need to be perfect, any images that look particularly awful should be discarded.