Wormius Realtimeus

This is doable in real time with a compact pipeline and a small nonlinear solver per frame. I'll give a complete, practical approach you can implement right away:

  • assumptions I’ll make (don’t worry if some don’t hold; I include fallbacks):

    1. camera intrinsics (K) are known (focal length (f), principal point). If not, I show a fallback.
    2. each bead has known physical radius (R) and the known center-to-center spacing sequence (s_i).
    3. each bead has a unique color and the colors appear in the same sequence left→right in the string (you said sequence is known).
    4. you have a measured ground plane (plane equation in camera coords) ( \mathbf{n}^\top \mathbf{X} + h = 0) or at least the height (z_{plane}) in camera coordinates.
    5. beads are nearly spherical (we’ll model them as spheroids/ellipsoids when needed).

Outline: detect bead 2D centers → assign by unique color/sequence → build 3D position solver that fits depths along camera rays subject to spacing constraints and plane-contact constraints (≥30% touching) → fit oriented ellipsoids for visualization using local tangent + radius → per-frame warm-start + few LM iterations for real-time.


1) Image preprocessing & 2D bead detection

  1. Convert the image to an appropriate color space (HSV or CIELAB) and threshold per known bead color to get masks.

  2. For each color mask:

    • morphological opening/closing to clean.
    • find contours; compute the sub-pixel centroid (image moments or an elliptical fit) and the image ellipse (cv2.fitEllipse) for ellipse axes & orientation.
  3. You now have image points (u_i = (x_i,y_i)) and a 2D ellipse per bead (if you want shape cues).

  4. Order beads according to the known color sequence (so you have correspondences between image points and physical bead indices).

Implementation tip: use a tiny per-color lookup table (HSV ranges), and filter by expected area to reject false detections. Warm start color thresholds adaptively if lighting shifts.
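A minimal sketch of this detection stage, assuming one bead per color (the HSV_RANGES values are placeholders you would tune for your actual beads):

import cv2
import numpy as np

# Hypothetical per-color HSV ranges, ordered by the known bead sequence.
HSV_RANGES = [
    ((0, 120, 80), (10, 255, 255)),    # e.g. a red bead
    ((50, 120, 80), (70, 255, 255)),   # e.g. a green bead
]

def detect_beads(bgr, hsv_ranges, min_area=30.0, max_area=5000.0):
    """Per-bead ((cx, cy), (major, minor), angle) from cv2.fitEllipse, or None."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    detections = []
    for lo, hi in hsv_ranges:
        mask = cv2.inRange(hsv, np.array(lo, np.uint8), np.array(hi, np.uint8))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)    # remove speckle
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)   # fill small holes
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        # filter by expected area to reject false detections, then keep the largest
        cands = [c for c in contours if min_area <= cv2.contourArea(c) <= max_area]
        if not cands or len(max(cands, key=cv2.contourArea)) < 5:
            detections.append(None)     # occluded / not found (fitEllipse needs >= 5 pts)
            continue
        detections.append(cv2.fitEllipse(max(cands, key=cv2.contourArea)))
    return detections                   # index i corresponds to bead i in the sequence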


2) Ray back-projection

For a bead image point (u_i = (x_i,y_i)) and intrinsics (K), compute normalized camera ray [ \mathbf{r}_i = K^{-1} [x_i, y_i, 1]^\top ] and unit ray direction (\hat{\mathbf r}_i = \mathbf{r}_i/|\mathbf{r}_i|).

Unknown 3D bead center (P_i) lies on the ray: (P_i = d_i \hat{\mathbf r}_i), where (d_i>0) is the depth along the ray.

If intrinsics are unknown:

  • use the known physical bead radius (R) and the apparent radius in pixels (r^{px}) to estimate depth directly: (d \approx f \cdot R / r^{px}). If (f) is unknown, bootstrap a plausible value (e.g. from sensor width / field of view) or run a short calibration with a calibration pattern. This will be less accurate but usable.
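A small sketch of the back-projection and the radius-based depth fallback (the names backproject and depth_from_radius are mine):

import numpy as np

def backproject(u, K):
    """Unit camera rays r_i = K^{-1} [x_i, y_i, 1]^T for Nx2 pixel points u."""
    pts_h = np.hstack([u, np.ones((len(u), 1))])      # homogeneous pixels
    rays = pts_h @ np.linalg.inv(K).T                 # apply K^{-1} row-wise
    return rays / np.linalg.norm(rays, axis=1, keepdims=True)

def depth_from_radius(f, R_bead, r_px):
    """Fallback: depth from apparent pixel radius, d ~ f * R / r_px."""
    return f * R_bead / r_px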

3) Constrained depth estimation (core solver)

We want 3D positions (P_i(d_i)) that

  • project near the observed (u_i) (data term),
  • satisfy center-to-center spacing (||P_{i+1}-P_i|| = s_i) (geometric constraint),
  • enforce that at least (m = \lceil 0.3 N\rceil) beads contact the ground plane: for those beads, impose the plane-contact constraint ( \mathbf{n}^\top P_j + h = -R) (center at distance (R) from the plane along the normal). (Note: the sign depends on your plane convention; adapt to your measured plane.)

Because selecting which beads touch the plane is combinatorial, use this practical approach:

  • heuristic selection: pick the (m) beads with the largest image y (lowest in the image) OR the smallest depth from an initial solve; these are most likely to rest on the plane. Force those to satisfy the contact constraint (implemented as a stiff penalty in the solver below).
  • optionally run two attempts: solve with the top-m by image y; if residuals are high, retry with the top-m by depth and keep the lower-residual result (see the sketch after this list).
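A sketch of that two-attempt selection, assuming a hypothetical solve_depths(contact_idx) wrapper around the least-squares solve described next, returning (depths, residual_norm):

import numpy as np

def pick_contact_set(u, d_init, m, solve_depths):
    """Try both contact-set heuristics; keep the lower-residual solve."""
    by_image_y = np.argsort(-u[:, 1])[:m]   # m lowest beads in the image
    by_depth = np.argsort(d_init)[:m]       # m beads nearest the camera
    best = None
    for contact_idx in (by_image_y, by_depth):
        d_est, cost = solve_depths(contact_idx)   # hypothetical wrapper
        if best is None or cost < best[1]:
            best = (d_est, cost, contact_idx)
    return best  # (depths, residual norm, contact indices)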

Mathematical optimization (unknown vector (d = [d_1..d_N])):

Minimize [ \mathcal{E}(d) = \sum_i w_i \| \pi(d_i \hat r_i) - u_i \|^2 + \lambda_s \sum_{i} \big(\| d_{i+1}\hat r_{i+1} - d_i \hat r_i \| - s_i\big)^2 + \lambda_p \sum_{j\in\mathcal{P}} \big(\mathbf{n}^\top (d_j \hat r_j) + h + R \big)^2 ] where (\pi(\cdot)) is projection with intrinsics (K), (\mathcal{P}) is the set of indices forced to touch the plane, (w_i) are weights from ellipse confidence, and (\lambda_s, \lambda_p) are penalty weights.

Use scipy.optimize.least_squares (Levenberg–Marquardt or trust-region-reflective) to solve for (d). Per-frame: warm-start with previous (d).

This optimization is small (N variables, where N is the number of beads, usually tens), so it runs quickly in C++/Ceres or Python+NumPy with a few LM iterations. In optimized C++ with Ceres you'll get >30 Hz easily for N≈20; in Python expect 10–30 Hz depending on the machine if you limit iterations.


4) Estimating ellipsoid orientation & radii

Once bead centers (P_i) are known, fit an oriented ellipsoid to each bead (center (P_i), rotation (R_i), semi-axes (a,b,c)):

Heuristic practical model:

  • principal axis of ellipsoid aligned with local tangent of the string:

    • compute tangent (T_i) as normalized (P_{i+1}-P_{i-1}) (endpoints use a one-sided difference to the nearest neighbor).
  • choose one principal axis along (T_i), the other two perpendicular; set the two transverse axes equal to physical bead radius (R) (so (a=R, b=R)).

  • the along-tangent axis (c) can be set to (R) for almost-spherical beads, or adjusted to better match the observed 2D ellipse: project the 3D ellipsoid into the image and solve for the (c) that best matches the image ellipse's major/minor ratio. There is a closed-form mapping from a 3D ellipsoid (a symmetric positive-definite matrix (Q)) to its projected 2D conic, so (c) follows from a low-dimensional scalar least-squares solve.

If beads are nearly spherical, using sphere (all axes = R) is simpler and robust.
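A sketch of the tangent-aligned frame construction under the near-spherical assumption (bead_frames is my name; P is the Nx3 array of solved centers):

import numpy as np

def bead_frames(P, R_bead):
    """Per-bead (rotation, semi-axes) with one axis along the chain tangent."""
    N = len(P)
    frames = []
    for i in range(N):
        lo, hi = max(i - 1, 0), min(i + 1, N - 1)   # one-sided at endpoints
        t = P[hi] - P[lo]
        t = t / np.linalg.norm(t)
        # any seed vector not parallel to t works for building transverse axes
        seed = np.array([0.0, 0.0, 1.0]) if abs(t[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
        a1 = np.cross(t, seed)
        a1 = a1 / np.linalg.norm(a1)
        a2 = np.cross(t, a1)                        # unit, orthogonal to t and a1
        R_i = np.column_stack([t, a1, a2])          # columns: tangent + transverse axes
        frames.append((R_i, np.array([R_bead, R_bead, R_bead])))  # c = R (near-sphere)
    return frames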

To render an ellipsoid so its projection matches the observed fitted 2D ellipse more closely, use the standard "image of a quadric is a conic" result (see Hartley & Zisserman, Multiple View Geometry). Practically, for each bead:

  • compute image ellipse matrix (C_i) from cv2.fitEllipse
  • relate the 3D quadric (Q_i) to the image conic through the dual forms, (C_i^{-1} \propto K [R_i \mid t_i]\, Q_i^{-1} [R_i \mid t_i]^\top K^\top), and solve for the unknown scale in the ellipsoid axes. That is heavier algebraically but yields better visual alignment (see the sketch below).
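A sketch of that quadric-to-conic projection using the dual forms (project_ellipsoid is my name; axes are the semi-axes, (R, t) the ellipsoid pose in camera coordinates):

import numpy as np

def project_ellipsoid(K, R, t, axes):
    """Image conic (up to scale) of an ellipsoid with semi-axes `axes`, pose (R, t)."""
    Q_star = np.diag([axes[0]**2, axes[1]**2, axes[2]**2, -1.0])  # dual quadric, body frame
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t                       # body -> camera transform
    Q_star_cam = T @ Q_star @ T.T      # dual quadrics transform covariantly
    P = K @ np.eye(3, 4)               # projection matrix in the camera frame
    C_star = P @ Q_star_cam @ P.T      # dual image conic C* = P Q* P^T
    return np.linalg.inv(C_star)       # conic matrix C (up to scale)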

5) Per-frame real-time considerations & acceleration

  • Warm start with previous frame's (d).
  • Solve only when beads moved or every k-th frame; use optical flow for tiny motions and skip heavy optimization.
  • Use analytic Jacobians for the LM solver for speed (I can provide if you want).
  • C++ (ceres-solver) implementation recommended for production; Python scipy.optimize.least_squares is fine for prototyping.
  • If you have a GPU and CUDA, move image processing to the GPU (OpenCV CUDA) and run the solver on the CPU.
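One way to implement the motion gate with Lucas–Kanade optical flow (needs_solve is my name; the threshold is a placeholder to tune):

import cv2
import numpy as np

def needs_solve(prev_gray, gray, prev_pts, motion_thresh_px=0.5):
    """prev_pts: Nx1x2 float32 bead centers from the previous frame."""
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_pts, None)
    motion = np.linalg.norm(nxt - prev_pts, axis=2).max()   # largest bead shift (px)
    return motion > motion_thresh_px, nxt   # True -> rerun the full solver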

6) Robustness tips

  • If occlusion or missed beads: solve with only visible beads; use spacing penalty to keep chain consistent and infer occluded bead positions.
  • If beads move relative to string (rotation), rely on color sequence — that gives strong correspondence.
  • If lighting changes, adapt the HSV thresholds with small per-frame feedback (e.g. a running mean of observed bead colors; sketched below).
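A sketch of that color feedback as an exponential running mean (update_hsv_center is my name; alpha is a placeholder smoothing factor):

import numpy as np

def update_hsv_center(center, observed_mean_hsv, alpha=0.1):
    """Drift each color's HSV range center toward the observed bead color."""
    return (1.0 - alpha) * np.asarray(center, float) + alpha * np.asarray(observed_mean_hsv, float)

# Re-center each bead's inRange bounds around the updated mean every frame,
# keeping the half-widths fixed.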

7) Concrete Python sketch (core solver)

This is a compact, runnable sketch for the depth solver. You’ll need OpenCV and scipy.

import numpy as np
import cv2   # used in the segmentation stage; not needed by the solver itself
from scipy.optimize import least_squares

# inputs (per frame)
K = ...                 # 3x3 intrinsics
rays = ...              # Nx3 unit rays in camera coords (from backproject(image points))
u = ...                 # Nx2 image points (x,y)
s = ...                 # (N-1) spacing array in meters
R_bead = 0.005          # bead radius in meters (example)
plane_n = np.array([0,0,1.0])  # plane normal in camera coords (example)
plane_h = -0.0          # plane constant: n^T X + h = 0 (adapt to your plane)
N = len(rays)
m = int(np.ceil(0.3 * N))  # how many must touch

# helper: project point to image
def project_point(X, K):
    x = K @ X
    return (x[:2] / x[2])

# residual function
def residuals(d):
    # d: N depths
    res = []
    # reprojection residuals
    for i in range(N):
        P = d[i] * rays[i]
        p = project_point(P, K)
        res.append(p[0] - u[i,0])
        res.append(p[1] - u[i,1])
    # spacing residuals
    lambda_s = 50.0  # spacing weight; least_squares squares the residuals
    for i in range(N-1):
        Pi = d[i]*rays[i]
        Pj = d[i+1]*rays[i+1]
        dist = np.linalg.norm(Pj - Pi)
        res.append(lambda_s * (dist - s[i]))
    # plane-contact residuals: pick indices with largest image y (lowest in image)
    # (assumes image origin at top-left)
    idx_sorted = np.argsort(-u[:,1])
    for j in idx_sorted[:m]:
        Pj = d[j]*rays[j]
        res.append(100.0 * (plane_n.dot(Pj) + plane_h + R_bead))
    return np.array(res)

# initial guess: simple proportional depth from pixel size or constant
d0 = np.ones(N) * 1.0  # 1m initial
res = least_squares(residuals, d0, method='lm', max_nfev=60, ftol=1e-6)  # 'lm' ignores bounds
d_est = res.x
# compute 3D centers
P_est = (d_est[:,None] * rays)

This is the minimal core; expand with covariance, weights, bounds (d>0), and good initial guesses.
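For the d > 0 bound, a minimal variant (SciPy's 'lm' method does not support bounds, so this switches to 'trf'):

res = least_squares(residuals, d0, method='trf',
                    bounds=(1e-3, np.inf), max_nfev=60, ftol=1e-6)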


8) Putting it all together (pipeline summary)

  1. per-frame: color-segment → centroid + ellipse fit → rays r_i.
  2. order by known color sequence → produce correspondences (u_i \leftrightarrow) bead index.
  3. build rays, spacing vector (s_i), plane parameters.
  4. choose (m=\lceil0.3N\rceil). Select candidate contact beads (lowest image y or smallest predicted depth).
  5. run least_squares solve for depths (d) with spacing + plane penalties.
  6. compute 3D centers (P_i).
  7. compute local tangents and build ellipsoid orientation; render/supply ellipsoid geometry to your renderer.
  8. warm-start the next frame from the current solution (see the loop sketch below).
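A per-frame driver in that shape, assuming the helper sketches above (detect_beads, backproject, bead_frames) plus the residuals/least_squares solve from section 7:

# d_prev carries the warm start between frames; video_frames is your frame source.
d_prev = None
for frame in video_frames:
    dets = detect_beads(frame, HSV_RANGES)      # ordered by the known color sequence
    u = np.array([d[0] for d in dets if d is not None], dtype=float)  # missing beads drop out
    rays = backproject(u, K)
    d0 = d_prev if d_prev is not None and len(d_prev) == len(u) else np.ones(len(u))
    sol = least_squares(residuals, d0, method='lm', max_nfev=60)      # warm-started solve
    d_prev = sol.x
    P = d_prev[:, None] * rays                  # 3D bead centers
    ellipsoids = bead_frames(P, R_bead)         # orientation + axes for rendering

# note: the residuals() sketch above reads u/rays/N/m from module scope;
# refactor it into a closure or class to feed it per-frame values like these.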

9) Edge cases & recommended improvements

  • If the chain has large curvature and the spacing constraints are tight, the problem becomes strongly nonlinear and can converge poorly; use a stronger (\lambda_s), or parameterize positions by a spline with along-string arc-length variables to reduce the DOF (sketched after this list).
  • If beads sit on plane in multiple poses, try combinatorial selection of which beads touch plane for a handful of top candidates (cheap if N small).
  • For best speed & robustness: implement solver in C++ + ceres and use analytic Jacobians. Offload color segmentation to GPU.
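A sketch of the spline parameterization idea (chain_from_controls is my name; ctrl_pts is a handful of 3D control points to optimize instead of per-bead depths, arclens the cumulative bead arc-lengths from the spacing sequence):

import numpy as np
from scipy.interpolate import CubicSpline

def chain_from_controls(ctrl_pts, arclens, samples=512):
    """Bead centers at given arc-lengths along a cubic spline through Kx3 controls."""
    t = np.linspace(0.0, 1.0, len(ctrl_pts))
    spline = CubicSpline(t, ctrl_pts, axis=0)
    ts = np.linspace(0.0, 1.0, samples)
    pts = spline(ts)                                  # dense polyline along the spline
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])     # approximate arc-length table
    idx = np.minimum(np.searchsorted(cum, arclens), samples - 1)
    return pts[idx]                                   # approximate bead centers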

Next steps

  • provide a fuller Python reference implementation (color segmentation → centroid → solver → ellipsoid fit) that you can run on example frames,
  • give the analytic Jacobian for the residuals (for speed),
  • or port the solver to a ceres-based C++ snippet for real-time 60Hz+.