
![[Pasted image 20231211111017.png]]

![[Pasted image 20231211113619.png]]

# End-to-end

1. Input: RGB-D image, processed by a CNN.
2. Feature Extraction:
    1. Run the CNN over the image to extract features, outputting $\hat{S}$, the predicted foreground object (a binary semantic segmentation).
    2. Also predict $\hat{M}_0$, the unsegmented per-pixel estimate of radii.
3. Distance Estimation:
    1. Mask the previously predicted radii $\hat{M}_0$ with the ground-truth binary segmentation $S$, giving $\hat{M}_1$: for each foreground pixel, $k$ predicted radii, one to each of the $k$ keypoints.
4. 3D Accumulator Space:
    1. Recall from Hough voting that the accumulator space is a discrete grid of cells, where we increment a cell's value for every line that passes through it (the lines correspond to votes for the keypoints in our image).
    2. Here the accumulator space is 3D (a voxel space): each foreground pixel casts a sphere of its predicted radius, and we increment every voxel that the sphere's surface passes through.
    3. The $k$ highest cell values (the points of greatest sphere intersection) correspond to the $k$ keypoints we're trying to predict.
5. Loss!
    1. Compare the predicted segmentation with the ground-truth segmentation, and the predicted keypoints with the ground-truth keypoints.
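The sphere-voting steps above can be sketched with brute-force NumPy. This is a toy, not the paper's implementation (a real system rasterizes the sphere surfaces far more efficiently); the function name, tolerance band, and grid parameters are all illustrative:

```python
import numpy as np

def radial_vote(points, radii, grid_min, voxel_size, grid_shape, tol=None):
    """Brute-force 3D Hough accumulator: each 3D point votes for every
    voxel whose center lies within `tol` of its predicted sphere surface."""
    if tol is None:
        tol = voxel_size  # generous shell so discretization doesn't miss votes
    axes = [grid_min[d] + (np.arange(grid_shape[d]) + 0.5) * voxel_size
            for d in range(3)]
    zz, yy, xx = np.meshgrid(*axes, indexing="ij")
    centers = np.stack([zz, yy, xx], axis=-1)        # (D, H, W, 3) voxel centers
    acc = np.zeros(grid_shape, dtype=np.int64)
    for p, r in zip(points, radii):
        dist = np.linalg.norm(centers - p, axis=-1)  # distance to sphere center
        acc[np.abs(dist - r) <= tol] += 1            # vote on the sphere shell
    return acc

# Toy check: 20 random "foreground points" with exact radii to one keypoint.
kp = np.array([0.0, 0.0, 0.0])
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(20, 3))
radii = np.linalg.norm(points - kp, axis=1)          # ground-truth radii
grid_min = np.array([-1.0, -1.0, -1.0])
acc = radial_vote(points, radii, grid_min, voxel_size=0.1,
                  grid_shape=(20, 20, 20))
peak = np.unravel_index(acc.argmax(), acc.shape)     # highest-vote voxel
est = grid_min + (np.array(peak) + 0.5) * 0.1        # voxel center -> 3D estimate
```

With exact radii, every point's sphere passes within one voxel of the keypoint, so the accumulator peak lands next to it; in the real pipeline the top-$k$ peaks are taken, one per keypoint.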
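For the radii themselves, the per-pixel distance labels can be built by back-projecting the depth map through a standard pinhole camera model and measuring the distance to a keypoint. A minimal sketch, assuming the usual intrinsics matrix layout; the function name and the numbers in the usage are illustrative:

```python
import numpy as np

def radial_map(depth, K, keypoint, mask):
    """Per-pixel radius label: distance from each foreground pixel's
    back-projected 3D point to a 3D keypoint (pinhole camera model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx                        # back-project to camera frame
    y = (v - cy) * depth / fy
    pts = np.stack([x, y, depth], axis=-1)           # (H, W, 3) 3D points
    r = np.linalg.norm(pts - keypoint, axis=-1)      # radius to the keypoint
    return np.where(mask, r, 0.0)                    # zero outside the foreground

# Usage with made-up intrinsics and a flat depth plane at z = 1.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
depth = np.ones((480, 640))
mask = np.ones((480, 640), dtype=bool)
r = radial_map(depth, K, np.array([0.0, 0.0, 1.0]), mask)
```

The pixel at the principal point back-projects exactly onto the keypoint, so its radius is zero, and radii grow toward the image corners.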