- Location: Turn our eyes in some direction (left, right, top, bottom, centre, or some random point)
- Boundary: Detect the boundary of whatever is there
- Classification: Is there an object? If so, classify it.
The YOLO (You Only Look Once) network works in a somewhat similar way:
- Location: Divide an image (or video frame) into an NxN grid, for example 3x3, whose cells correspond to left, right, top, bottom, centre, etc.
- Boundary: From a point in a grid cell, predict a rectangle (the bounding box), which is the boundary
- Classification: Is there an object? If so, classify it.
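The three steps above can be sketched in a few lines of plain Python. This is only an illustration, not the actual YOLO implementation: the function names (`cell_of`, `decode_box`, `classify`), the 0.5 threshold, and the box encoding (centre offset inside the cell, width and height as fractions of the image) are assumptions loosely modelled on the grid idea described here. A real network would produce the `pred` and `scores` values; below they are supplied by hand.

```python
def cell_of(x, y, width, height, n=3):
    """Location: return (row, col) of the NxN grid cell containing pixel (x, y)."""
    col = min(int(x * n / width), n - 1)   # clamp so the last pixel stays in the grid
    row = min(int(y * n / height), n - 1)
    return row, col


def decode_box(row, col, pred, width, height, n=3):
    """Boundary: turn a per-cell prediction into a rectangle in pixels.

    pred = (dx, dy, w, h) is an assumed encoding:
      dx, dy: box centre as an offset inside the cell, each in 0..1
      w, h:   box size as a fraction of the whole image
    Returns (x_min, y_min, x_max, y_max).
    """
    dx, dy, w, h = pred
    cx = (col + dx) / n * width    # centre x in pixels
    cy = (row + dy) / n * height   # centre y in pixels
    bw, bh = w * width, h * height
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)


def classify(scores, labels, threshold=0.5):
    """Classification: return the best label, or None if no score reaches the threshold."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best] if scores[best] >= threshold else None


if __name__ == "__main__":
    # A 600x300 frame with a 3x3 grid; the centre cell is (row 1, col 1).
    print(cell_of(300, 150, 600, 300))
    # Hand-written prediction for the centre cell: box centred in the cell,
    # half the image wide and half the image tall.
    print(decode_box(1, 1, (0.5, 0.5, 0.5, 0.5), 600, 300))
    print(classify([0.1, 0.8, 0.1], ["cat", "dog", "car"]))
```

The point of the sketch is that each grid cell is responsible for a region of the image, and both the rectangle and the class are read off from that cell's prediction in a single pass.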
Links:
OpenCV: https://opencv.org