Commit 6f0bb20b authored by Nafis A Abeer

Update README.md

parent 5a6a6d6d
@@ -34,8 +34,8 @@ graph LR;
    - the outputs are the frames that constitute the video
- **VSLAM**: using the MATLAB VSLAM algorithm, this process takes the raw frames and does two things: it finds "features" (important points used for tracking) and "keyframes" (a subset of the full set of frames that captures the most information about the video; camera movement between keyframes is what gives features their 3D positions)
    - the outputs are the keyframes and the features
- **YOLOv4**: Utilizing a Java library, we perform deep learning using the YOLOv4 model, a convolutional neural network that identifies objects in keyframes and outlines them with bounding boxes. YOLO (You Only Look Once) is designed for speed and accuracy in real-time object detection, processing each image in a single evaluation and using a single neural network to predict bounding boxes and class probabilities directly from full images.
    - the outputs are the bounding boxes around each object in each keyframe; we also record information about these bounding boxes for each frame in a CSV file
- **Object Tracking**: our main data-structures-and-algorithms contribution. This system takes the bounding boxes of each object (2D, on a single frame) and the features (3D, on the same frame), finds the features that lie inside each bounding box, and then tries to reconcile the objects in the current frame with objects found in past frames. We solve this with a data structure called an [ObjectSet](./src/main/java/object_detection/ObjectSet.java): each new object is compared against every object already found, and if the two share some minimum percentage of features, they are combined and the database is updated accordingly.
    - there is further explanation and runtime analysis in Appendix A
    - the output is an iteratively more accurate set of objects that correspond to reality
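The bounding-box metadata written to CSV by the YOLOv4 step could be sketched as follows. This is a minimal illustration: the `Detection` record, the column order, and the `toCsv` helper are assumptions, not the project's actual format.

```java
import java.util.List;

// Hypothetical sketch of recording per-frame bounding-box metadata to CSV.
// The column layout below is an assumption, not the project's actual schema.
public class BoxCsvSketch {

    // One detection: frame index, class label, confidence, and box geometry.
    public record Detection(int frame, String label, double conf,
                            int x, int y, int w, int h) {
        public String toCsvRow() {
            return String.join(",",
                    String.valueOf(frame), label, String.valueOf(conf),
                    String.valueOf(x), String.valueOf(y),
                    String.valueOf(w), String.valueOf(h));
        }
    }

    // Serialize all detections with a header row, one line per bounding box.
    public static String toCsv(List<Detection> detections) {
        StringBuilder sb = new StringBuilder("frame,label,confidence,x,y,w,h\n");
        for (Detection d : detections) {
            sb.append(d.toCsvRow()).append('\n');
        }
        return sb.toString();
    }
}
```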
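The points-in-boxes matching and the ObjectSet merge rule described above can be sketched roughly as below. Only `ObjectSet` itself exists in the repo; the `Box` record, `featuresInBox`, `sameObject`, and the min-size overlap ratio are illustrative assumptions about how such a check might look, not the project's actual implementation.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the object-tracking step: find which VSLAM features
// fall inside a YOLO bounding box, then decide whether a new detection matches
// an already-tracked object by feature overlap. Names and the threshold rule
// are assumptions, not the project's actual API.
public class TrackingSketch {

    // A 2D bounding box from YOLO on one keyframe.
    public record Box(double xMin, double yMin, double xMax, double yMax) {
        public boolean contains(double x, double y) {
            return x >= xMin && x <= xMax && y >= yMin && y <= yMax;
        }
    }

    // Collect the IDs of features whose 2D pixel location lies inside the box.
    public static Set<Integer> featuresInBox(Box box, Map<Integer, double[]> pixelOf) {
        Set<Integer> hits = new HashSet<>();
        for (var e : pixelOf.entrySet()) {
            double[] p = e.getValue();
            if (box.contains(p[0], p[1])) hits.add(e.getKey());
        }
        return hits;
    }

    // Merge rule: if the fraction of shared features (relative to the smaller
    // set) exceeds a threshold, treat the new detection as the same object.
    public static boolean sameObject(Set<Integer> existing, Set<Integer> candidate,
                                     double threshold) {
        if (existing.isEmpty() || candidate.isEmpty()) return false;
        Set<Integer> shared = new HashSet<>(existing);
        shared.retainAll(candidate);
        double overlap = (double) shared.size() / Math.min(existing.size(), candidate.size());
        return overlap >= threshold;
    }
}
```

When `sameObject` returns true, the two feature sets would be unioned inside the ObjectSet and the database updated, as the bullet above describes.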
@@ -66,21 +66,45 @@ The following links:
# Work Breakdown

**Nafis Abeer**:
- Initial project management and task division
- Dataset acquisition (initial research)
- VSLAM implementation (acquiring 3D points and the associated 2D pixel locations of points in each frame)
- YOLOv4 implementation (creating bounding-box metadata)
- Half of the points-within-bounding-box algorithm (implemented classes designed by Rohan)

**Rohan Kumar**:
- Server-side management (oversaw connection of the various features)
- Object set classes and bounding box classes (all classes related to points)
- Object disjoint set algorithm for tracking objects across frames
- Debugged the majority of the code during integration
- Built Maven system for building and running

**Zane Mroue**:
- Database management (MongoDB)
- Spring Boot server (connection to frontend)
- Server Integration
- Data retrieval
- ObjectSet updates
- Efficient data storage

**Samuel Gulinello**:
- GUI Creation
- All JavaScript, HTML, CSS files
- Utilized and configured ThreeJS for the point cloud
- Worked on server configuration

**Sanford Edelist**:
- Built unit tests for Object Detection
- Built unit tests for YOLO
- Debugged unit tests

# Appendix

### A: References and Material Used

- The UI uses a library called [ThreeJS](https://threejs.org/), which renders the visualization of the point cloud.
- The object detection uses [YOLOv4](https://arxiv.org/pdf/2004.10934.pdf), via [the YOLONet sample](https://github.com/bytedeco/javacv/blob/master/samples/YOLONet.java) from the javacv library
- The server was written with [Spring Boot](https://spring.io/projects/spring-boot) 
- The database used was [MongoDB](https://www.mongodb.com/), hosted on a MongoDB-provided free-tier server
- The testing was done using [JUnit](https://junit.org/junit5/) 
- The MathWorks [Monocular VSLAM tutorial](https://www.mathworks.com/help/vision/ug/monocular-visual-simultaneous-localization-and-mapping.html) guided our `vslam_implemenation` MATLAB script