Vision

Computer-vision nodes that analyze the image and hand you back data. They run Apple’s Vision and CoreML frameworks on a background queue, pass the source through untouched, and expose results both as overlay textures and as float ports you can route into any parameter.

Detection is throttled and smoothed (exponential moving average) so the tracked values glide at full render rate without jitter. Tuning fields like confidence and detection rate live in the inspector; the ports below are the connectable inputs and outputs.
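The throttle-plus-smoothing behaviour described above can be sketched as follows. This is a minimal illustration, not the app's implementation: the `SmoothedValue` class, the `alpha` blend factor and the `simulate` helper are hypothetical names, and the real smoothing constant is whatever the inspector exposes.

```python
from dataclasses import dataclass


@dataclass
class SmoothedValue:
    """Hypothetical helper: glides toward the latest detection with an EMA."""
    alpha: float          # assumed per-frame blend factor, 0 < alpha <= 1
    value: float = 0.5    # current smoothed output
    target: float = 0.5   # most recent raw detection

    def detect(self, raw: float) -> None:
        # Called only when a (throttled) detection pass finishes.
        self.target = raw

    def step(self) -> float:
        # Called once per rendered frame: move a fraction of the way to the target.
        self.value += self.alpha * (self.target - self.value)
        return self.value


def simulate(raw_detections, frames_per_detection, alpha):
    """Detect once every `frames_per_detection` frames, but output every frame."""
    smoothed = SmoothedValue(alpha=alpha)
    out = []
    for raw in raw_detections:
        smoothed.detect(raw)
        for _ in range(frames_per_detection):
            out.append(smoothed.step())
    return out


if __name__ == "__main__":
    # A value jumps from 0.5 to 0.9 while detection runs at a quarter of the render rate.
    print([round(v, 3) for v in simulate([0.9, 0.9], 4, 0.5)])
```

Because `step` runs every frame while `detect` runs only occasionally, the output keeps moving between detections instead of jumping when a new result arrives.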

In the live preview these run asynchronously, so the result can trail the image by about one inference. Each node has a Synchronous (frame-accurate) toggle to lock it to the current frame, and during a Render Timeline export they are always frame-accurate.

Face Detection

faceDetectionVision

Detects faces in the image and emits a box overlay, an alpha mask, and per-face position/size data.

Inputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Image to analyze.

Outputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Source passthrough (zero-copy).
overlay	texture	,	,	Transparent canvas with bounding boxes.
mask	texture	,	,	White inside face regions, transparent outside.
faceCount	float	,	,	Number of detected faces.
face1X	float	,	0 – 1	Center X of the largest face (0.5 if none).
face1Y	float	,	0 – 1	Center Y of the largest face.
face1Size	float	,	0 – 1	Normalized long-edge size.

Runs Vision's VNDetectFaceRectanglesRequest (~5–15 ms/frame on Apple Silicon, no model file needed). Composite overlay back over the video, use mask to isolate or blur faces, or drive effects from the largest face’s position. Box and mask styling, confidence and max-face count are inspector fields.
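As a rough sketch of how the float outputs relate to the detected boxes, the function below derives the four ports from a list of normalized rectangles. The `face_ports` name and the `(x, y, w, h)` box layout are hypothetical, "largest" is assumed to mean the longest edge, and the no-face values for face1Y and face1Size are assumptions (the table only states 0.5 for face1X).

```python
def face_ports(boxes):
    """boxes: list of (x, y, w, h), normalized 0-1, top-left origin (assumed layout)."""
    if not boxes:
        # Only face1X = 0.5 is documented; the Y and size fallbacks are assumptions.
        return {"faceCount": 0.0, "face1X": 0.5, "face1Y": 0.5, "face1Size": 0.0}
    # Assumption: the "largest" face is the one with the longest edge.
    x, y, w, h = max(boxes, key=lambda b: max(b[2], b[3]))
    return {
        "faceCount": float(len(boxes)),
        "face1X": x + w / 2.0,       # box center, X
        "face1Y": y + h / 2.0,       # box center, Y
        "face1Size": max(w, h),      # normalized long edge
    }


if __name__ == "__main__":
    print(face_ports([(0.1, 0.1, 0.2, 0.2), (0.5, 0.4, 0.4, 0.3)]))
```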

Pose Tracking

poseTrackingVision

Tracks a human skeleton and emits a passthrough, a skeleton overlay, a detected flag, and X/Y ports for 13 joints.

Inputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Image to analyze.

Outputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Source passthrough.
overlay	texture	,	,	Transparent canvas with bones + joints.
detected	float	,	,	1 if a body was found, else 0.
headX / headY	float	,	0 – 1	Nose joint, normalized, top-left origin.
leftShoulderX / Y	float	,	0 – 1	Left shoulder.
rightShoulderX / Y	float	,	0 – 1	Right shoulder.
leftElbowX / Y	float	,	0 – 1	Left elbow.
rightElbowX / Y	float	,	0 – 1	Right elbow.
leftWristX / Y	float	,	0 – 1	Left wrist.
rightWristX / Y	float	,	0 – 1	Right wrist.
leftHipX / Y	float	,	0 – 1	Left hip.
rightHipX / Y	float	,	0 – 1	Right hip.
leftKneeX / Y	float	,	0 – 1	Left knee.
rightKneeX / Y	float	,	0 – 1	Right knee.
leftAnkleX / Y	float	,	0 – 1	Left ankle.
rightAnkleX / Y	float	,	0 – 1	Right ankle.

Runs Vision's VNDetectHumanBodyPoseRequest on the first detected body. Each of the 13 joints is exposed as a separate ...X and ...Y float port (the table pairs them for brevity). All values are normalized 0–1 with a top-left origin and EMA-smoothed. Drive instancers, particles or distortions from a wrist or the head, or composite the skeleton overlay for a motion-capture look.
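The sketch below expands the paired table rows into the 26 individual port names and shows the coordinate handling. Vision itself reports normalized points with a lower-left origin, so a Y flip is needed to reach the top-left convention the ports use; whether the node performs exactly this flip internally is an assumption, and the helper names are hypothetical.

```python
# Joint prefixes as listed in the outputs table (head is the nose joint).
JOINTS = [
    "head",
    "leftShoulder", "rightShoulder",
    "leftElbow", "rightElbow",
    "leftWrist", "rightWrist",
    "leftHip", "rightHip",
    "leftKnee", "rightKnee",
    "leftAnkle", "rightAnkle",
]


def port_names():
    """One X and one Y float port per joint."""
    return [f"{joint}{axis}" for joint in JOINTS for axis in ("X", "Y")]


def to_top_left(x, y_lower_left):
    """Flip a lower-left-origin normalized point to a top-left origin."""
    return x, 1.0 - y_lower_left


def to_pixels(x, y, width, height):
    """Scale a normalized top-left point to pixel coordinates."""
    return x * width, y * height


if __name__ == "__main__":
    print(len(port_names()), port_names()[:4])
    print(to_pixels(*to_top_left(0.5, 0.75), 1920, 1080))
```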

Optical Flow

opticalFlowVideo

Computes per-pixel motion between consecutive frames, encoded as a texture (RG = motion XY, B = magnitude).

Inputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Current frame; compared against the previous one.
sensitivity	float	10	,	Motion amplification.

Outputs

Port	Type	Default	Range	Notes
texture	texture	,	,	R/G = motion X/Y (0.5 = none), B = magnitude.

Estimates flow with a Horn-Schunck-style gradient method against the previous frame. The encoded vectors are ready to drive displacement, speed masks or motion-reactive effects: feed the texture into a Noise Displace node, or read its blue channel as a motion amount. Pre-blur and a motion threshold are inspector fields.
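To make the channel layout concrete, here is one plausible encoding and its inverse. Only the layout (R/G centred on 0.5, B as magnitude) comes from the table above; the exact scaling by sensitivity and the function names are assumptions made for illustration.

```python
import math


def _clamp01(v):
    return min(1.0, max(0.0, v))


def encode_flow(dx, dy, sensitivity=10.0):
    """Assumed mapping: 0.5 means no motion; sensitivity amplifies before clamping."""
    r = _clamp01(0.5 + 0.5 * dx * sensitivity)
    g = _clamp01(0.5 + 0.5 * dy * sensitivity)
    b = _clamp01(math.hypot(dx, dy) * sensitivity)
    return r, g, b


def decode_flow(r, g, sensitivity=10.0):
    """Inverse of the assumed mapping (exact only where nothing was clamped)."""
    return (r - 0.5) * 2.0 / sensitivity, (g - 0.5) * 2.0 / sensitivity


if __name__ == "__main__":
    rgb = encode_flow(0.02, -0.01)
    print(rgb, decode_flow(rgb[0], rgb[1]))
```

A downstream shader or node that wants signed motion subtracts 0.5 from R and G first; one that only needs "how much is moving" can read B directly.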

Depth Estimation

depthEstimationVision

Produces a grayscale monocular depth map from any 2D image, with range remap, invert, smoothing and depth-slab isolation.

Inputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Image to estimate depth from.
near	float	0	0 – 1	Lower depth bound; maps to white.
far	float	1	0 – 1	Upper depth bound; maps to black.
smoothing	float	0.4	0 – 1	Temporal EMA; 1 = frozen frame.
invert	bool	false	,	Swap near/far so close = black.
isolateEnabled	bool	false	,	Enable depth-slab isolation.
isolateTarget	float	0.5	0 – 1	Center depth of the band.
isolateWidth	float	0.2	0 – 1	Width of the band.
isolateFalloff	float	0.08	0 – 1	Smooth falloff outside the band.

Outputs

Port	Type	Default	Range	Notes
depth	texture	,	,	Grayscale depth map (16-bit float).

Runs the DepthAnything V2 CoreML model to infer depth from a single image - no depth camera required. A Metal post-pass remaps the range, inverts, smooths over time, and can isolate a depth slab. Pair it with Depth Displacement or 3D Effects to push foreground and background apart.
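The remap, invert and temporal smoothing steps of the post-pass can be sketched per pixel as below. This is an illustration under assumptions, not the Metal shader: it assumes the raw depth value grows with distance, and the function names are hypothetical. The near-to-white, far-to-black mapping and the "1 = frozen frame" smoothing behaviour follow the inputs table.

```python
def remap_depth(d, near=0.0, far=1.0, invert=False):
    """Map depth d so that near -> 1.0 (white) and far -> 0.0 (black)."""
    span = max(far - near, 1e-6)               # guard against near == far
    t = min(1.0, max(0.0, (d - near) / span))  # 0 at near, 1 at far
    value = 1.0 - t
    return 1.0 - value if invert else value    # invert: close = black


def smooth(previous, current, smoothing=0.4):
    """Temporal EMA: smoothing = 0 follows the input, 1 freezes the previous frame."""
    return smoothing * previous + (1.0 - smoothing) * current


if __name__ == "__main__":
    print(remap_depth(0.25, near=0.0, far=0.5))
    print(smooth(0.2, 0.8, smoothing=0.4))
```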

Depth Key

depthKeyVideo

Masks a texture by an external depth map: isolates a depth band (near/far + target/width/falloff) and outputs it as masked alpha, cutout black, or a grayscale mask.

Inputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Image to mask.
depth	texture	,	,	Depth map (e.g. from Depth Estimation, or a clip's own depth).
near	float	0	0 – 1	Lower depth bound.
far	float	1	0 – 1	Upper depth bound.
isolateTarget	float	0.5	0 – 1	Center depth of the kept band.
isolateWidth	float	0.25	0 – 1	Width of the kept band.
isolateFalloff	float	0.1	0 – 1	Smooth falloff outside the band.

Outputs

Port	Type	Default	Range	Notes
texture	texture	,	,

Drive one clip’s cutout from another clip’s depth: wire a Depth Estimation node (or any depth texture) into the depth input and keep only the slab you want. The output mode (alpha / cutout / mask) is an inspector toggle.
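A per-pixel sketch of the band isolation and the three output modes might look like the following. The smoothstep falloff shape, the way each mode combines the band weight with the pixel, and all function names are assumptions; only the parameter names, defaults and the mode list come from this page.

```python
def smoothstep(edge0, edge1, x):
    t = min(1.0, max(0.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)


def band_weight(d, target=0.5, width=0.25, falloff=0.1):
    """1.0 inside the band around target, fading to 0.0 across `falloff`."""
    outside = abs(d - target) - width / 2.0   # <= 0 inside the band
    if falloff <= 0.0:
        return 1.0 if outside <= 0.0 else 0.0
    return 1.0 - smoothstep(0.0, falloff, outside)


def depth_key(rgba, d, mode="alpha", **band):
    """Apply the band weight to one RGBA pixel (assumed mode semantics)."""
    r, g, b, a = rgba
    w = band_weight(d, **band)
    if mode == "alpha":    # masked alpha: transparent outside the band
        return (r, g, b, a * w)
    if mode == "cutout":   # cutout black: opaque black outside the band
        return (r * w, g * w, b * w, a)
    if mode == "mask":     # grayscale mask of the band itself
        return (w, w, w, 1.0)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(depth_key((1.0, 0.5, 0.25, 1.0), 0.9, mode="cutout"))
```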

Blob Tracker

blobTrackerVision

Tracks moving or bright blobs in a texture and draws a bounding box around each one, with optional connecting lines.

Inputs

Port	Type	Default	Range	Notes
texture	texture	,	,

Outputs

Port	Type	Default	Range
texture	texture	,	,
overlay	texture	,	,
blobCount	float	,	,
blob1X	float	,	,
blob1Y	float	,	,
blob1Size	float	,	,

Detects blobs, assigns each a persistent ID across frames, and draws a box around it. The texture output passes the source through untouched. The overlay output is a transparent canvas with just the boxes and lines; composite it back over the video with a blend, or feed it to a picture node. Detection runs fully asynchronously (GPU downscale plus a non-blocking readback) so it never stalls the render.

Two detection modes. Motion diffs each frame against the previous one, so it finds whatever is moving and ignores anything static (best for video and live camera). Luma thresholds a mono channel, so it finds bright regions whether they move or not (best for generators, 3D scenes, and stills). Pick the mono channel (luma / R / G / B) and, in Luma mode, optionally invert to track dark regions instead.
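The difference between the two modes comes down to which per-pixel test marks a pixel as part of a blob. The sketch below shows both tests on a tiny mono frame stored as nested lists; the function names are hypothetical, and the real node works on a downscaled GPU texture rather than Python lists.

```python
def motion_mask(previous, current, threshold):
    """Motion mode: a pixel is 'on' when it changed by more than the threshold."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(previous, current)
    ]


def luma_mask(frame, threshold, invert=False):
    """Luma mode: a pixel is 'on' when it is brighter (or, inverted, darker)."""
    return [
        [(v < threshold) if invert else (v > threshold) for v in row]
        for row in frame
    ]


if __name__ == "__main__":
    prev = [[0.0, 0.0], [0.0, 1.0]]
    curr = [[0.0, 0.5], [0.0, 1.0]]
    print(motion_mask(prev, curr, 0.1))  # only the pixel that changed
    print(luma_mask(curr, 0.4))          # every bright pixel, moving or not
```

Note how the static bright pixel in the lower right is ignored by the motion test but picked up by the luma test.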

Tuning. Threshold sets the sensitivity; Min size / Max size filter blobs by their normalized long edge; Max blobs caps how many are kept (largest first). Move dist is the furthest a blob can travel between frames and still keep its ID. Rate Hz throttles detection while the boxes glide at render rate, and Smoothing controls that glide (high = snappy, low = trailing). Proc px is the internal processing resolution; lower it if you need more performance headroom.
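One way to picture the size filter, the Max blobs cap and the role of Move dist is the sketch below. It uses a greedy nearest-neighbour match, which is a swapped-in technique chosen for brevity and not necessarily what the node does; `filter_blobs`, `assign_ids` and the data layouts are hypothetical.

```python
import math
from itertools import count


def filter_blobs(blobs, min_size, max_size, max_blobs):
    """Keep blobs whose long edge is in range, largest first, capped at max_blobs."""
    kept = [b for b in blobs if min_size <= b["size"] <= max_size]
    return sorted(kept, key=lambda b: b["size"], reverse=True)[:max_blobs]


def assign_ids(tracked, detections, move_dist, id_source):
    """tracked: {id: (x, y)} from the last frame; detections: [(x, y)] this frame.

    Greedy matching (an assumption): closest pairs first, and a blob keeps its
    ID only if it moved no further than move_dist.
    """
    pairs = sorted(
        (math.dist(pos, det), blob_id, i)
        for blob_id, pos in tracked.items()
        for i, det in enumerate(detections)
    )
    result, used_ids, used_dets = {}, set(), set()
    for dist, blob_id, i in pairs:
        if dist > move_dist:
            break                      # all remaining pairs are further apart
        if blob_id in used_ids or i in used_dets:
            continue
        result[blob_id] = detections[i]
        used_ids.add(blob_id)
        used_dets.add(i)
    for i, det in enumerate(detections):
        if i not in used_dets:         # unmatched detection becomes a new blob
            result[next(id_source)] = det
    return result


if __name__ == "__main__":
    ids = count(1)
    frame1 = assign_ids({}, [(0.2, 0.2), (0.8, 0.8)], 0.1, ids)
    frame2 = assign_ids(frame1, [(0.82, 0.8), (0.25, 0.2)], 0.1, ids)
    print(frame1, frame2)
```

A larger Move dist keeps IDs stable for fast subjects but makes it more likely that two nearby blobs swap identities.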

Drawing. Toggle the bounding boxes (thickness, padding, color), a centroid dot, and the connecting lines between blobs. Lines have three modes: mesh connects every pair within Link dist, nearest links each blob to its closest neighbour, and chain links the blobs left to right. The blobCount and blob1X / blob1Y / blob1Size outputs expose the largest blob's centroid and size to drive the rest of the graph.
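The three connecting-line modes can be read as three ways of choosing which pairs of centroids to join. The sketch below returns index pairs for each mode; the `links` function and its `link_dist` default are hypothetical, and details such as tie-breaking are assumptions.

```python
import math
from itertools import combinations


def links(points, mode, link_dist=0.3):
    """points: list of (x, y) centroids. Returns a list of (i, j) index pairs."""
    indices = range(len(points))
    if mode == "mesh":
        # Every pair that is close enough.
        return [
            (i, j) for i, j in combinations(indices, 2)
            if math.dist(points[i], points[j]) <= link_dist
        ]
    if mode == "nearest":
        # Each blob to its single closest neighbour (duplicates collapsed).
        edges = set()
        for i in indices:
            others = [j for j in indices if j != i]
            if others:
                j = min(others, key=lambda k: math.dist(points[i], points[k]))
                edges.add((min(i, j), max(i, j)))
        return sorted(edges)
    if mode == "chain":
        # Sort by X and join consecutive blobs, left to right.
        order = sorted(indices, key=lambda i: points[i][0])
        return list(zip(order, order[1:]))
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    pts = [(0.1, 0.5), (0.2, 0.5), (0.9, 0.5)]
    for mode in ("mesh", "nearest", "chain"):
        print(mode, links(pts, mode))
```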

Depth Displacement

depthDisplacementVision

Per-pixel fractal (FBM) distortion of the source, gated by a depth map so only chosen depths move.

Inputs

Port	Type	Default	Range	Notes
texture	texture	,	,	Source image.
depth	texture	,	,	Depth map (.r) gating displacement strength.
scale	float	0.02	0 – 0.2	Displacement magnitude.
displaceX	float	1	-2 – 2
displaceY	float	1	-2 – 2
frequency	float	4	0.1 – 32	Noise detail.
octaves	float	4	1 – 6	FBM octaves.
speed	float	0.5	-4 – 4	Animation speed (wall-clock).
roughness	float	0.5	0 – 1	Gain between octaves.
depthMin	float	0	0 – 1	Remap depth into a 0–1 mask.
depthMax	float	1	0 – 1
invert	bool	false	,	Invert the depth mask.
chromatic	float	0	0 – 2	RGB-split separation.

This node is a feedback-style Post-FX: it has no texture output and instead writes into the parent’s frame. Pair it with Depth Estimation so the foreground subject melts while the background sits still, or vice versa via invert.
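To show how the depth map gates the distortion, the sketch below computes a displaced sample coordinate for one pixel. It is an illustration only: the hash-based value noise stands in for whatever noise the shader uses, the way speed and the per-axis offsets enter the lookup is assumed, and every function name is hypothetical. Parameter names and defaults follow the inputs table; the chromatic split is left out.

```python
import math


def hash01(ix, iy):
    """Cheap deterministic pseudo-random value in [0, 1) for a lattice point."""
    n = (ix * 374761393 + iy * 668265263) & 0xFFFFFFFF
    n = ((n ^ (n >> 13)) * 1274126177) & 0xFFFFFFFF
    return (n ^ (n >> 16)) / 2 ** 32


def value_noise(x, y):
    """Smoothly interpolated lattice noise in [0, 1]."""
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy
    sx, sy = fx * fx * (3 - 2 * fx), fy * fy * (3 - 2 * fy)
    top = hash01(ix, iy) + sx * (hash01(ix + 1, iy) - hash01(ix, iy))
    bottom = hash01(ix, iy + 1) + sx * (hash01(ix + 1, iy + 1) - hash01(ix, iy + 1))
    return top + sy * (bottom - top)


def fbm(x, y, octaves=4, roughness=0.5):
    """Sum octaves of noise; roughness is the amplitude gain between octaves."""
    total, norm, amplitude, frequency = 0.0, 0.0, 1.0, 1.0
    for _ in range(max(1, int(octaves))):
        total += amplitude * value_noise(x * frequency, y * frequency)
        norm += amplitude
        amplitude *= roughness
        frequency *= 2.0
    return total / norm


def depth_mask(d, depth_min=0.0, depth_max=1.0, invert=False):
    """Remap depth into a 0-1 mask between depth_min and depth_max."""
    span = max(depth_max - depth_min, 1e-6)
    m = min(1.0, max(0.0, (d - depth_min) / span))
    return 1.0 - m if invert else m


def displaced_uv(u, v, d, t=0.0, scale=0.02, displace_x=1.0, displace_y=1.0,
                 frequency=4.0, octaves=4, speed=0.5, roughness=0.5,
                 depth_min=0.0, depth_max=1.0, invert=False):
    """Offset (u, v) by signed FBM noise, scaled by the depth mask."""
    m = depth_mask(d, depth_min, depth_max, invert)
    # Two decorrelated lookups (assumed offsets) give independent X and Y motion.
    nx = fbm(u * frequency + t * speed, v * frequency, octaves, roughness) * 2 - 1
    ny = fbm(u * frequency + 37.0, v * frequency + t * speed, octaves, roughness) * 2 - 1
    return u + nx * scale * displace_x * m, v + ny * scale * displace_y * m


if __name__ == "__main__":
    print(displaced_uv(0.3, 0.4, d=0.0))  # mask is 0, so the pixel stays put
    print(displaced_uv(0.3, 0.4, d=1.0, t=1.5))
```

With the mask at 0 the coordinate is returned unchanged, which is what keeps one depth range still while the other moves.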