RCVNet (Vote from the Center)

# End-to-end

1. Input: an RGB-D image, which is processed by a CNN.
2. Feature Extraction:
    1. Run the CNN over the image to extract features. It outputs $\hat S$, the predicted foreground object mask (a binary semantic segmentation).
    2. It also predicts $\hat M_{0}$, the unsegmented estimate of the radii for each pixel in the image.
3. Distance Estimation:
    1. For each pixel in the ground-truth binary segmentation $S$, we take the previously predicted radii $\hat M_{0}$ to each of the $k$ keypoints. This gives $k$ predicted radii per foreground pixel, in the form $\hat M_{1}$.
4. 3D Accumulator Space:
    1. Recall from Hough Voting that the accumulator space is a discrete 2D grid of cells, where we increment the value of a cell for every line that passes through it (the lines correspond to the keypoints in our image).
    2. Here the accumulator is a 3D voxel space. Each foreground pixel, together with its depth, gives a 3D point, and each predicted radius defines a sphere centered on that point. We increment every voxel that the surface of one of these spheres passes through.
    3. The $k$ highest cell values (the points where the most spheres intersect) correspond to the $k$ keypoints we're trying to predict.
5. Loss!
    1. Compare the predicted segmentation with the ground-truth segmentation, and the predicted keypoints with the ground-truth keypoints.
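The voting in step 4 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the method's actual implementation: it votes for a single keypoint, stores the accumulator sparsely in a `Counter` rather than a dense voxel array, and counts a voxel as lying on a sphere's surface when its centre is within half a voxel of the predicted radius. The names `vote_spheres`, `voxel_size`, and the toy points are hypothetical.

```python
import math
from collections import Counter


def vote_spheres(points, radii, voxel_size=1.0):
    """Accumulate sphere-surface votes in a sparse 3D voxel grid.

    points: (x, y, z) scene points, e.g. back-projected foreground pixels.
    radii:  predicted distance from each point to ONE keypoint.
    Returns a Counter mapping voxel indices (i, j, k) to vote counts.
    """
    acc = Counter()
    half = voxel_size / 2.0  # assumed surface tolerance: half a voxel
    for p, r in zip(points, radii):
        # Only voxels near the sphere's bounding box can touch its surface.
        lo = [math.floor((c - r) / voxel_size) - 1 for c in p]
        hi = [math.floor((c + r) / voxel_size) + 1 for c in p]
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    # Voxel centre in scene coordinates.
                    centre = ((i + 0.5) * voxel_size,
                              (j + 0.5) * voxel_size,
                              (k + 0.5) * voxel_size)
                    # Vote if the sphere surface passes near this centre.
                    if abs(math.dist(p, centre) - r) <= half:
                        acc[(i, j, k)] += 1
    return acc


# Toy example: six points around a known keypoint, with exact radii
# standing in for the network's predicted radii.
keypoint = (5.5, 5.5, 5.5)
points = [
    (1.5, 5.5, 5.5), (8.5, 5.5, 5.5),
    (5.5, 2.5, 5.5), (5.5, 7.5, 5.5),
    (5.5, 5.5, 0.5), (5.5, 5.5, 9.5),
]
radii = [math.dist(p, keypoint) for p in points]

acc = vote_spheres(points, radii)
voxel, votes = acc.most_common(1)[0]  # peak of the accumulator
print(voxel, votes)  # (5, 5, 5) 6
```

The peak voxel is the only cell that all six spheres pass through, so it recovers the cell containing the keypoint. For $k$ keypoints, one option is to repeat the vote once per keypoint using that keypoint's channel of radii.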