
Core Functions

Eric Swanson edited this page Nov 15, 2023 · 13 revisions

The module corefunctions.py contains a series of functions that implement fundamental photogrammetry calculations to georectify oblique imagery onto a user-defined real world XYZ grid and merge multiple camera views, for both grayscale and color images. Additionally, this module contains functions to generate statistical image products for a given set of images and corresponding camera extrinsic and intrinsic values, as well as functions to generate pixel instruments for use in bathymetric inversion, surface current, or run-up calculations.

Georectification and Merging Multiple Camera Views

For rectification tasks, the user first initializes an XYZGrid() object, specifying the x and y limits and the resolution of the real-world grid in the x and y directions. The value given for z should be the estimated water level at the time of data collection, relative to the local vertical datum used in specifying extrinsic information. See XYZGrid for more info.
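
Conceptually, such a grid is a pair of 2D coordinate arrays plus a constant elevation. A minimal numpy sketch of that idea (the function name and argument layout here are illustrative, not the library's exact XYZGrid API):

```python
import numpy as np

def make_xyz_grid(xlims, ylims, dx, dy, z):
    """Build a real-world rectification grid: 2D X/Y arrays at constant z."""
    x = np.arange(xlims[0], xlims[1] + dx, dx)   # cross-shore coordinates
    y = np.arange(ylims[0], ylims[1] + dy, dy)   # alongshore coordinates
    X, Y = np.meshgrid(x, y)                     # 2D coordinate arrays
    Z = np.full_like(X, z, dtype=float)          # constant estimated water level
    return X, Y, Z

# 0-200 m cross-shore, 0-500 m alongshore, 5 m resolution, z = 0.5 m
X, Y, Z = make_xyz_grid((0, 200), (0, 500), 5, 5, 0.5)
```

Every rectified frame is then sampled at these same fixed world coordinates, which is what makes multi-camera merging and time-series products possible.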

Next, the user initializes a CameraData() object for each calibrated camera being utilized. Each instance of this class requires all camera intrinsic and extrinsic values unique to that device. For cameras that have not yet been calibrated and whose intrinsic values are not known, the user is directed to the CalTech Camera Calibration library, or other relevant calibration libraries such as the calibration functions contained in OpenCV. Intrinsic values are accepted in the CIRN convention or in direct linear transform (DLT) coefficient notation. See the CoastalImageLib User Manual for detailed information on calibration and intrinsic value formatting. The user can also optionally specify the coordinate system being utilized, with the further option of providing the local origin for a coordinate transform. See CameraData for more info.

If oblique imagery was captured using a non-stationary camera, for example an unmanned aerial vehicle (UAV) mounted camera, the user is directed to the CIRN Quantitative Coastal Imaging library for calibration and stabilization. Note that this library requires stationary ground control points (GCPs) and stabilization control points (SCPs). See the CIRN Quantitative Coastal Imaging library User Manual for detailed information on GCPs and SCPs.

The corefunctions.py function mergeRectify() is designed to merge and rectify one or more cameras at one timestamp into a single frame. For multiple subsequent frames, the user can either loop through mergeRectify() and rectify each desired frame on the same XYZ grid, or call the function rectVideos() to merge and rectify frames from one or more cameras provided in video format, sampled at the same time and frame rate. Merging of multiple cameras includes a histogram matching step, where the histogram of the first camera view is used as the reference histogram for balancing subsequent camera views. This step helps remove visible camera seams and improves the congruity of illumination.
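
The histogram-matching step described above can be sketched with a standard CDF-matching routine. This is a simplified grayscale stand-in for the idea, not the library's matchHist implementation:

```python
import numpy as np

def match_histogram(ref, img):
    """Remap img's gray levels so its histogram matches ref's (CDF matching)."""
    ref_vals, ref_counts = np.unique(ref.ravel(), return_counts=True)
    img_vals, img_idx, img_counts = np.unique(
        img.ravel(), return_inverse=True, return_counts=True)

    # Empirical CDFs of both images
    ref_cdf = np.cumsum(ref_counts) / ref.size
    img_cdf = np.cumsum(img_counts) / img.size

    # For each gray level in img, find the ref level with the closest CDF value
    matched_vals = np.interp(img_cdf, ref_cdf, ref_vals)
    return matched_vals[img_idx].reshape(img.shape)
```

Balancing each subsequent camera against the first camera's histogram in this way is what suppresses visible brightness jumps at the seams.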

Statistical Image Products

The corefunctions.py module also contains the function imageStats() to generate statistical image products for a given set of stationary oblique or georectified images contained in a three dimensional array, in either grayscale or color. All image product calculations are taken from the Argus video monitoring convention. The products and their descriptions are as follows:

  1. Brightest: These images are the composite of all the brightest pixel intensities at each pixel location throughout the entire collection.
  2. Darkest: These images are the composite of all the darkest pixel intensities at each pixel location throughout the entire collection.
  3. Timex: Time-exposure (timex) images represent the mathematical time-mean of all the frames captured over the period of sampling.
  4. Variance: Variance images are found from the variance of image intensities of all the frames captured over the period of sampling.
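
For a stack of co-registered grayscale frames, the four products above reduce to simple per-pixel reductions. A sketch of the definitions (frames stacked along axis 0 here for clarity; imageStats itself takes a three-dimensional array whose layout may differ):

```python
import numpy as np

def argus_products(stack):
    """Compute Argus-style image products from frames stacked along axis 0."""
    return {
        "brightest": stack.max(axis=0),   # per-pixel maximum intensity
        "darkest":   stack.min(axis=0),   # per-pixel minimum intensity
        "timex":     stack.mean(axis=0),  # time-exposure (time mean)
        "variance":  stack.var(axis=0),   # per-pixel intensity variance
    }
```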

Pixel Products

The corefunctions.py module also contains the function pixelStack() to create subsampled pixel timestacks in either grayscale or color for use in algorithms such as bathymetric inversion, surface current estimation, or run-up calculations. Pixel timestacks show variations in pixel intensity over time. The main pixel products are included below, however additional instruments can be created from these main classes. For example, a single pixel, which may be useful for estimating wave period, can be generated by creating an alongshore transect of length 1.

  1. Grid (also known as Bathy Array in Holman and Stanley 2007 and other publications that reference Argus image products): This is a 2D array of pixels covering the entire nearshore, which can be utilized in bathymetry estimation algorithms.
  2. Alongshore/Y Transect (sometimes referred to as Vbar): This product is commonly utilized in estimating longshore currents.
  3. Cross-shore/X Transect (sometimes referred to as Runup Array): Cross-shore transects can be utilized in estimating wave runup.
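
Each of these instruments amounts to sampling a fixed set of grid pixels in every frame of a time series. A hedged sketch of that sampling step (the array layout is assumed for illustration and is not the pixelStack API):

```python
import numpy as np

def transect_timestack(frames, rows, cols):
    """Sample fixed pixel locations in every frame to build a timestack.

    frames: num_frames x N x M array of rectified grayscale images
    rows, cols: equal-length index arrays defining the transect pixels
    Returns a num_frames x num_pixels timestack (time down, space across).
    """
    return frames[:, rows, cols]

# Transect along row 3; a single-pixel "instrument" is a transect of length 1
frames = np.arange(2 * 5 * 5).reshape(2, 5, 5)
stack = transect_timestack(frames, np.full(5, 3), np.arange(5))
```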

Function definitions and examples

Functions are grouped by usage (core rectification functions, then pixel and image product functions) and listed in alphabetical order within each group.

Core rectification functions

cameraSeamBlend

cameraSeamBlend(IrIndv, numcams, nc) source

This function takes rectifications from different cameras (but same grid)
and merges them together into a single rectification.

The function performs a weighted average where pixels closest to the seams
are not weighted as highly as those closest to the center of the camera
rectification.

Notes:
    - Calculates Euclidean distance to nearest non-zero pixel value
    - edt: Euclidean Distance Transform

Args:
    IrIndv (ndarray): A NxMxK matrix where N and M are the grid lengths
        for the rectified image and K is the number of cameras.
        Each k entry is a rectified image from a camera.
    numcams (int): The number of cameras. Is also the K dimension in the
        NxMxK matrix IrIndv
    nc (int): size of color channel, eg: 1=grayscale, 3=RGB

Returns:
    M (ndarray): An NxM uint8 matrix of the grayscale merged image.
        N and M are the grid lengths for the rectified image.

Example:

import numpy as np

# cameraSeamBlend() is called from mergeRectify() as the last step to merge values
# IrIndv is an NxMx2 array of rectified images from two cameras
numcams = 2
nc = 1  # grayscale
Ir = cameraSeamBlend(IrIndv, numcams, nc)
mergedImg = np.flipud(Ir.astype(np.uint8))
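
The distance-weighted merge described above can be sketched as follows, using scipy's Euclidean distance transform as the docstring's edt note suggests. The weight construction here is illustrative; the library's exact weighting may differ:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def seam_blend(IrIndv):
    """Weighted-average merge of K rectified views stacked as N x M x K.

    Pixels far from a camera's seam (the edge of its valid footprint) get
    more weight; zero-valued pixels are treated as outside the footprint.
    """
    N, M, K = IrIndv.shape
    weights = np.zeros((N, M, K))
    for k in range(K):
        valid = IrIndv[:, :, k] > 0
        # Distance from each valid pixel to the nearest invalid (zero) pixel
        weights[:, :, k] = distance_transform_edt(valid)
    total = weights.sum(axis=2)
    total[total == 0] = 1  # avoid division by zero where no camera sees the pixel
    return (IrIndv * weights).sum(axis=2) / total
```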

dlt2UV

dlt2UV(grid, cal) source

This function computes the distorted UV coordinates (UVd) that
correspond to a set of world xyz points for a given camera m matrix
for the DLT equations. Used with DLT-formatted intrinsic and extrinsic
camera parameters.

Arguments:
    cal: CameraData object containing the DLT coefficient vector A->L
    grid: XYZGrid object containing real world coords

Returns:
    DU: Nx1 vector of distorted U coordinates for N points.
    DV: Nx1 vector of distorted V coordinates for N points.

Example:

if calib.mType == "DLT":
    Ud, Vd = dlt2UV(grid, calib)
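
Under the hood, a DLT projection is a homogeneous linear map followed by a perspective divide. A generic sketch of that math (the coefficient ordering in coastalimagelib's A->L vector may differ from this 3x4 layout):

```python
import numpy as np

def dlt_project(P, xyz):
    """Project N world points through a 3x4 DLT/projection matrix P.

    xyz: Nx3 array of world coordinates.
    Returns (U, V): image coordinates for the N points.
    """
    xyz1 = np.hstack([xyz, np.ones((xyz.shape[0], 1))])  # homogeneous coords
    uvw = xyz1 @ P.T                                     # linear map
    U = uvw[:, 0] / uvw[:, 2]                            # perspective divide
    V = uvw[:, 1] / uvw[:, 2]
    return U, V
```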

getPixels

getPixels(image, Ud, Vd, s) source

Pulls rgb or gray pixel intensities from an image at the specified
pixel locations corresponding to the UV coordinates calculated in either
xyz2DistUV or dlt2UV.

Args:
    image (ndarray): image where pixels will be taken from
    Ud: Nx1 vector of distorted U coordinates for N points
    Vd: Nx1 vector of distorted V coordinates for N points
    s: shape of output image

Returns:
    ir (ndarray): pixel intensities

Example:

# Grab pixels from image at each position
ir = getPixels(image, Ud, Vd, s)

matchHist

matchHist(ref, image) source

Chris Sherwood's method of histogram-matching an RGB image.

Matches the histogram of an input image to a reference
image in order to better blend the seams
of multiple cameras.

Arguments:
    ref (ndarray): reference image
    image (ndarray): image to match histogram

Returns:
    matched (ndarray): modified image with matching histogram

Example:

# Iterate through each camera to produce a single merged frame
for k, (I, calib) in enumerate(zip(input_frames, cameras)):
    nc = calib.nc
    # Determine if the user provided a filepath or image
    if isinstance(I, str):
        # Load image from current camera
        image = imageio.imread(I)
    else:
        image = input_frames[:, :, (k * nc): (k * nc + nc)]

    # Match histograms
    if k == 0:
        ref = image
    else:
        image = matchHist(ref, image)

mergeRectify

mergeRectify(input_frames, cameras, grid) source

This function performs image rectifications at one timestamp given
the associated extrinsics, intrinsics, and distorted images
corresponding to each camera contained within the CameraData object.
The function utilizes matchHist to match images from each camera
to the same histogram, then calls xyz2DistUV or dlt2UV to find
corresponding UVd values to the input grid and pulls the rgb pixel
intensity for each value using getPixels.

If a multi-camera rectification is desired, images, intrinsic_list,
and extrinsic_list can be input as lists with one entry per camera.

The function calls cameraSeamBlend as a last step to merge the values.

Inputs:
    input_frames (list OR ndarray): 1xK list of paths to images for
        each camera, OR NxMxK struct of images, one image per camera
        at the desired timestamp for rectification
    cameras (array of CameraData objects): contains:
        intrinsic_list (list): 1x11xK internal calibration data for
            each camera
        extrinsic_list (list): 1x6xK external calibration data for
            each camera
        mType: intrinsic format ('DLT' or 'CIRN')
    grid: XYZGrid object of the XYZ rectification grid

Returns:
    Ir (ndarray): Image intensities at xyz points (georectified image)

Example:

rect_frame = mergeRectify(image_list, cameras, grid)

rectVideos

rectVideos(video_list, cameras, grid, numFrames, savefps = 'None') source

This function performs image rectifications on video files,
and saves a merged and rectified .avi to the user's drive.

Inputs:
    video_list (list): 1xK list of paths to video files for each camera
    cameras (array of CameraData objects): 1xK array of CameraData
        intrinsic_list (list): 1x11xK internal calibration data
        extrinsic_list (list): 1x6xK external calibration data
        mType: intrinsic format ('DLT' or 'CIRN')
    grid: XYZGrid object of the XYZ rectification grid
    numFrames (int): number of frames to rectify
    savefps (int): frames per second to save the video at; default is 'None'.
        If 'None' is specified, the video will be saved at the same fps as the first input video

Returns:
    rect_array: h x w x numFrames array of rectified frames

Example:

numFrames = 20  # 2 fps for 10 seconds
loc = 'C:/Documents/GitHub/coastalimagelib/coastalimagelib/ExampleData/example-data-videos/'
videoFileBase = '1604152800.Sat.Oct.31_14_00_00.GMT.2020'
# cams is a user-defined list of camera identifier strings
video_list = [(loc + '.'.join([videoFileBase, i, 'avi'])) for i in cams]

# Call rectVideos
rect = rectVideos(video_list, cameras, grid, numFrames, savefps=1)

xyz2DistUV

xyz2DistUV(grid, cal) source

This function computes the distorted UV coordinates (UVd) that
correspond to a set of world xyz points for given camera
extrinsics and intrinsics. The function also
produces a flag variable to indicate whether each UVd point is valid.
Used with CIRN-formatted intrinsic and extrinsic parameter values.

Arguments:
    cal: CameraData object containing CIRN-formatted intrinsic and extrinsic values
    grid: XYZGrid object containing real world coords

Returns:
    DU: Nx1 vector of distorted U coordinates for N points
    DV: Nx1 vector of distorted V coordinates for N points

Example:

if calib.mType == "CIRN":
    Ud, Vd = xyz2DistUV(grid, calib)
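
The CIRN projection is a pinhole camera model plus lens distortion. Distortion aside, the core world-to-image step looks roughly like this (R, C, fx, fy, cx, cy are generic pinhole parameters for illustration, not the exact CIRN intrinsic vector layout):

```python
import numpy as np

def pinhole_project(R, C, fx, fy, cx, cy, xyz):
    """Project world points into (undistorted) image coordinates.

    R: 3x3 world-to-camera rotation; C: camera position in world coords.
    Returns U, V and a validity flag for points in front of the camera.
    """
    cam = (xyz - C) @ R.T          # world -> camera coordinates
    flag = cam[:, 2] > 0           # valid only if in front of the camera
    U = fx * cam[:, 0] / cam[:, 2] + cx
    V = fy * cam[:, 1] / cam[:, 2] + cy
    return U, V, flag
```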

Pixel and image product functions

imageStats

imageStats(im_mat, save_flag=0, disp_flag=0) source

This function generates statistical image products for a given set of
images and corresponding extrinsics/intrinsics. The statistical image
products are the timex, brightest, variance, and darkest images.

All images must have the same dimensions.

Attributes:
    Bright: brightest pixels at each location across the
            collection
    Dark: darkest pixels at each location across the collection
    Timex: time-exposure (mean) of all pixels at each location
            across the collection
    Variance: standard deviation of all pixels at each location
            across the collection

Args:
    im_mat (ndarray): matrix of images
    save_flag (bool): flag to indicate if products should be saved
            automatically to the user's drive
    disp_flag (bool): flag to display the output image products

Example:

imageStats(all_frames, save_flag = 0, disp_flag = 1)

pixelStack

pixelStack(frames, grid, cameras, disp_flag=0) source

Function that creates pixel instruments for use in bathymetric inversion,
surface current, run-up calculations, or other quantitative analyses.
Instruments can be made in either world or local rotated coordinates;
however, for ease and accuracy in bathymetric inversion, surface current,
or run-up applications, the instruments should be defined in local
coordinates.


Inputs:
    frames (ndarray): N x M x K x num_frames struct of images,
        one image per camera (K) at the desired timestamp
        for rectification (N x M is image height x image width,
        K is number of cameras, num_frames is number of frames)
    grid: XYZGrid object of desired real world coordinates
    cameras (K length array of CameraData objects): contains
        intrinsic_list (list): 1x11xK internal calibration data
        extrinsic_list (list): 1x6xK external calibration data
        origin: local origin (x, y, z, angle)
        coords (string): 'geo' or 'local'; if 'geo', extrinsics are
            transformed to local, but the origin is needed for the transform
        mType (string): format of intrinsic matrix,
            'CIRN' is default, 'DLT' is also supported
    sample_freq (int): desired frequency at which to grab pixels.
        This does not factor in camera metadata. The user must know the
        frames per second of the collection. freq=1 means grab pixels at
        every frame; freq=2 means grab pixels at every other frame.
    disp_flag (bool): flag to display output image products

Returns:
    pixels (ndarray): pixel intensities at xyz points (georectified image).
        Size depends on the type of pixel instrument. Axis 3 has the same
        length as (number of frames / sample rate)

Example:

import numpy as np

# cams, m, ex: user-defined camera names, intrinsic vectors, and extrinsic vectors
cameras = np.empty(len(cams), dtype=object)
for i in range(len(cams)):
    cameras[i] = cf.CameraData(m[i], ex[i], coords='local', origin='None', mType='DLT', nc=3)

# Cross-shore transect: x varies, y is fixed
xMin = 100
xMax = 300
y = 1200
dx = 5
dy = 5   # illustrative resolution value
z = 0    # illustrative estimated water level relative to the vertical datum
xshore_trans = cf.XYZGrid([xMin, xMax], [y], dx, dy, z)
rect = pixelStack(image_list, xshore_trans, cameras, disp_flag=1)