
Python WinRT Image Capture (and Focus Stacking)

Python/WinRT is a crazy thing:

The Windows Runtime Python Projection (Python/WinRT) enables Python developers to access Windows Runtime APIs directly from Python in a natural and familiar way.

With it we can directly access the Windows.Media.Capture API and take photos.

Note, I had to use Python 3.7 (not 3.8) due to a known bug:

> pip install winrt
ERROR: Could not find a version that satisfies the requirement winrt (from versions: none)
ERROR: No matching distribution found for winrt

Unfortunately Python/WinRT is no longer under active development, which is sad because it's really nice. Some reports on the issue tracker claim it doesn't work, but almost everything worked for me. The only unfixable issue I ran into was that the apartment type is immediately set to MTA and is not configurable.

The lack of an STA apartment prevented me from accessing the preview video from the camera. I think the only way to get an exclusive lock on the camera with WinRT is to open the preview video, and I think auto focus, auto exposure, etc. don't work without an active preview.

That said, I still successfully took photos with Python/WinRT with manual exposure settings.

And for a fun project I took images at every focus increment and fed them through a Python focus-stack package and align_image_stack.

Taking Photos

First, import a ton of WinRT APIs to use the camera:

In [2]:

from pathlib import Path
from math import log2

from IPython.display import Image

from winrt.windows.media.devices import (
    FocusSettings,
    FocusMode,
    ColorTemperaturePreset,
)
from winrt.windows.storage import (
    StorageFolder,
    CreationCollisionOption,
)
from winrt.windows.media.mediaproperties import ImageEncodingProperties
from winrt.windows.media.capture import (
    MediaCapture,
    MediaStreamType,
    MediaCaptureInitializationSettings,
    PhotoCaptureSource,
)
from winrt.windows.media.capture.frames import MediaFrameSourceGroup, MediaFrameSourceInfo

Then find the camera you want to use in MediaFrameSourceGroup. If you have multiple cameras you'll have to pick the correct one from the list.

You need to specifically select the photo MediaStreamType, because your camera can probably take both photos and video.

Yes, you really can just use Python's await on the WinRT API.

In [2]:

# Select your camera. You might have multiple if you have a front and rear camera.
CAMERA_INDEX = 0

# List every camera (frame source group) that Windows knows about.
sources = await MediaFrameSourceGroup.find_all_async()
print("Available cameras:")
for i, s in enumerate(sources):
    print(f"  {i}: {s.display_name}")
print()

source = sources[CAMERA_INDEX]
# Pick the stream that produces still photos; the camera may also expose video streams.
for source_info in source.source_infos:
    if source_info.media_stream_type == MediaStreamType.PHOTO:
        break
else:
    raise RuntimeError("Selected camera has no photo stream")
print("Selected camera:", source.display_name)

Available cameras:
  0: World Facing Right
  1: Integrated Camera

Selected camera: World Facing Right
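
The photo-stream selection above can also be pulled out into a small helper that fails loudly when the camera has no photo stream. This is only a sketch: `first_photo_info` is a hypothetical name, and it assumes nothing beyond each item exposing a `media_stream_type` attribute, as the source infos do in the cell above.

```python
def first_photo_info(source_infos, photo_type):
    """Return the first source info whose media_stream_type equals photo_type.

    Hypothetical helper: works on any iterable of objects that expose a
    `media_stream_type` attribute.
    """
    for info in source_infos:
        if info.media_stream_type == photo_type:
            return info
    # No matching stream: raise instead of silently keeping a stale variable.
    raise LookupError("camera exposes no photo stream")
```

In the notebook it would be called as `source_info = first_photo_info(source.source_infos, MediaStreamType.PHOTO)`.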

Configure the MediaCapture object to take photos with the selected camera:

In [3]:

media_capture = MediaCapture()
settings = MediaCaptureInitializationSettings()
settings.source_group = source_info.source_group
# settings.photo_capture_source = PhotoCaptureSource.AUTO
await media_capture.initialize_async(settings)

In [4]:

PROPERTIES_INDEX = 2

video_device_controller = media_capture.video_device_controller
properties = video_device_controller.get_available_media_stream_properties(MediaStreamType.PHOTO)
# I would try to inspect the properties, but I can't figure out how to make Python
# cast them to an ImageEncodingProperties object.

# Instead just hardcode it:
await video_device_controller.set_media_stream_properties_async(
    MediaStreamType.PHOTO, properties[PROPERTIES_INDEX])

Configure all the ISO and exposure settings manually. Use a very low ISO and a slower shutter speed. We want to make sure these settings don't change between photos.

I actually couldn't get any auto settings to work. I would have liked to use auto exposure detection and then "EV lock" the exposure so that it doesn't change between images.

Alternatively, you can comment out all of this configuration, launch the Windows Camera app and then close the camera app. The camera settings you set in the app should be preserved. (Maybe, that only sort of works.)

I've also left the auto configuration here commented out if you'd like to try to fix it.

In [92]:

ISO = 100
EXPOSURE = -4 # log base 2 seconds?!??! (I guess it makes them integers)
WHITE_BALANCE = ColorTemperaturePreset.TUNGSTEN

# await video_device_controller.iso_speed_control.set_auto_async()
await video_device_controller.iso_speed_control.set_value_async(ISO)

assert video_device_controller.exposure.try_set_auto(False)
assert video_device_controller.exposure.try_set_value(EXPOSURE)
# video_device_controller.exposure.try_set_auto(True)

# Manually set focus, see below for more info
await video_device_controller.focus_control.set_value_async(350)

# White balance is broken?
# assert video_device_controller.white_balance.try_set_auto(False)
# await video_device_controller.white_balance_control.set_preset_async(WHITE_BALANCE)
# video_device_controller.white_balance.try_set_auto(True)
# Color temperature:
# assert video_device_controller.white_balance.try_set_value(6500.0)
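
Assuming the exposure value really is log base 2 of the shutter time in seconds (that is the guess in the comment above, not something checked against the camera driver), an EXPOSURE of -4 would mean a 1/16 s shutter. A small sketch of the conversion in both directions; `seconds_to_exposure` and `exposure_to_seconds` are hypothetical helper names, not part of WinRT:

```python
from math import log2


def seconds_to_exposure(seconds):
    """Convert a shutter time in seconds to the nearest integer log2 exposure value."""
    return round(log2(seconds))


def exposure_to_seconds(exposure):
    """Convert a log2 exposure value back to a shutter time in seconds."""
    return 2.0 ** exposure


print(exposure_to_seconds(-4))  # 0.0625, i.e. 1/16 s
```

Because the control only accepts whole steps under this assumption, each step doubles or halves the shutter time, so a target like 1/15 s rounds to the same value as 1/16 s.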

Then, take a photo!

The Windows.Storage APIs are really horrible because you're supposed to use them inside something like a UWP application container.

In [6]:

async def take_photo(filename, path=None):
    if not path:
        path = Path.cwd()
    folder = await StorageFolder.get_folder_from_path_async(str(path.absolute()))
    file = await folder.create_file_async(filename, CreationCollisionOption.REPLACE_EXISTING)
    await media_capture.capture_photo_to_storage_file_async(ImageEncodingProperties.create_png(), file)

await take_photo("test.png")
Image(filename="test.png")

Out[6]:

Incrementally Step Focus

Just manually set the camera's focus to capture lots of images for focus stacking. You can query the FocusControl for the minimum and maximum supported values:

In [94]:

focus_control = video_device_controller.focus_control
assert FocusMode.MANUAL in focus_control.supported_focus_modes

focus_control.min, focus_control.max

Out[94]:

(185, 365)

If your object is only in a narrow plane of focus, adjust MIN_FOCUS and MAX_FOCUS by trial and error. Otherwise set them to the minimum and maximum values of your camera.

Then, just capture photos in a loop at each focus value. This will take a really long time (increase FOCUS_STEP for faster testing).

In [99]:

MIN_FOCUS = focus_control.min
MAX_FOCUS = focus_control.max
FOCUS_STEP = focus_control.step # Small step of 1 for good focus stacking
# FOCUS_STEP = 50 # Use a big step for testing

photos_path = Path("photos")
photos_path.mkdir(exist_ok=True)  # Don't fail when the cell is re-run

for i in range(MIN_FOCUS, MAX_FOCUS + 1, FOCUS_STEP):
    print(f"Progress: {(i-MIN_FOCUS)/(MAX_FOCUS-MIN_FOCUS+1)*100:.1f}%", end="\r")
    await focus_control.set_value_async(i)
    await take_photo(f"{i}.png", photos_path)
print('Done!'.ljust(20), end="\r")

Done!
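
To get a feel for how long the loop will run before starting it, you can list the focus values it visits. A minimal sketch, using the (185, 365) range reported above as example input; `focus_positions` is a hypothetical helper name:

```python
def focus_positions(min_focus, max_focus, step):
    """List the focus values the capture loop visits (max_focus included when the step lands on it)."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(min_focus, max_focus + 1, step))


positions = focus_positions(185, 365, 1)
print(len(positions))  # 181 photos at a step of 1
```

With the testing step of 50 the same range gives only four photos (185, 235, 285 and 335), so the last few focus values before the maximum are never visited.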

Preview the focus stack. I placed an object close to the camera for this picture to make the focus range obvious.

In [ ]:

from ipywidgets import interact, IntSlider

interact(
    # Show the photo captured at the selected focus value.
    lambda focus: Image(filename=f"photos/{focus}.png"),
    focus=IntSlider(min=MIN_FOCUS, max=MAX_FOCUS, value=300));

Focus Stack Images

Use align_image_stack and enfuse from the Hugin panorama tools to do the focus stacking.

In theory, we could have even captured images from the camera directly into memory and sent them off to another Python library without even writing them to disk, but Hugin is a set of external programs, so we wrote them to disk.

This step will be really slow.

In [105]:

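%%sh
# Align the frames: -m also optimizes field of view (focus changes magnification),
# -C crops to the area covered by all images.
align_image_stack \
    -m -C \
    -a photos/aligned \
    photos/*.png
# Fuse by contrast only, so the sharpest frame wins at each pixel.
enfuse \
    --hard-mask \
    --exposure-weight=0 \
    --saturation-weight=0 \
    --contrast-weight=1 \
    --contrast-edge-scale=0.6 \
    --output=enfuse.tif \
    photos/aligned*
magick convert enfuse.tif enfuse.png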
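
If you would rather drive the tools from Python than from a shell cell, the same two commands can be built as argument lists. This is a sketch, not a tested pipeline: `align_command` and `enfuse_command` are hypothetical helpers, the flags are copied from the cell above, and the frames are assumed to be named `<focus>.png` as in the capture loop. Sorting by the numeric focus value keeps the stack in order even if the values have different digit counts.

```python
from pathlib import Path


def align_command(photos_dir="photos"):
    """Build the align_image_stack call for the captured frames, in focus order."""
    photos_dir = Path(photos_dir)
    # Only files named <focus>.png are captured frames; sort them numerically.
    frames = sorted(
        (p for p in photos_dir.glob("*.png") if p.stem.isdigit()),
        key=lambda p: int(p.stem),
    )
    prefix = photos_dir / "aligned"
    return ["align_image_stack", "-m", "-C", "-a", str(prefix), *map(str, frames)]


def enfuse_command(photos_dir="photos", output="enfuse.tif"):
    """Build the enfuse call; run it only after align_image_stack has written its files."""
    aligned = sorted(Path(photos_dir).glob("aligned*.tif"))
    return [
        "enfuse",
        "--hard-mask",
        "--exposure-weight=0",
        "--saturation-weight=0",
        "--contrast-weight=1",
        "--contrast-edge-scale=0.6",
        f"--output={output}",
        *map(str, aligned),
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)`, calling `enfuse_command` only after the alignment step has finished so that the aligned files exist.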

In [5]:

Image(filename="enfuse.png")

Out[5]:
