Commit c8a65205 authored by Rohan Kumar

Merge branch 'front-end' into 'master'

Front end

See merge request ec504/ec504_projects/group8!14
parents 05296d0c d5d9b8ce

README.md

deleted 100644 → 0
+0 −81
# Documentation

**Nafis Abeer**: nafis@bu.edu

**Rohan Kumar**: roku@bu.edu

**Zane Mroue**: zanem@bu.edu

**Samuel Gulinello**: samgul@bu.edu

**Sanford Edelist**: edelist@bu.edu

### Description

Visual Simultaneous Localization and Mapping (VSLAM) is the process of taking a camera feed, along with the camera's position, and building a map of the surrounding world from visual input alone. This project builds on that process by also tracking objects within each frame. This raises two problems: detecting objects, and then mapping and tracking those objects within a 3D space.

### Implementation

For simplicity, the general system framework is shown below:

```mermaid
graph LR;
    Z[Camera/Video] -->|Input| A
    A[VSLAM]-->|KeyFrames| B[YOLOv4];
    A-->|Features| C[Object Tracking];
    A-->|Features| E
    B-->|Object Detection|C;
    C-->|Objects| D[Database];
    D-->|Objects| E[GUI];
```

- **Camera/Video**: as of right now, we use prerecorded video for our examples and tests, but this system can easily be extended to real-time camera systems or drone footage
  - the output is the set of frames that constitute the video
- **VSLAM**: using the MATLAB VSLAM algorithm, this process takes the raw frames and does two things: it finds "features", the important points used for tracking, and it finds "keyframes", a subset of the full set of frames that captures the most information about the video (camera movement across keyframes is what gives feature points their 3D positions)
  - the outputs are the keyframes and the features
- **YOLOv4**: using a Java library, we run the YOLOv4 model, a convolutional neural network that takes a keyframe and finds a bounding box around each object the model can discern
  - the output is the set of bounding boxes around the objects in each keyframe
- **Object Tracking**: the project's significant contribution in terms of data structures and algorithms. This system takes the bounding boxes of each object (in 2D, on a single frame) and the features (in 3D, on the same frame), finds the features that fall inside each bounding box, and then tries to reconcile the objects in the current frame with the objects found in past frames. We solve this by implementing a data structure called an [ObjectSet](./src/main/java/object_detection/ObjectSet.java): each new object is compared against every object already found, and if some percentage of the features contained in both objects match, we combine the two objects and update the database accordingly (a sketch of this merge step follows this list)
  - there is further explanation and runtime analysis in Appendix A
  - the output is an iteratively more accurate set of objects that correspond to reality
- **Database**: for ease of retrieving, updating, and storing objects and their corresponding features, we use a MongoDB database (a persistence sketch also follows this list)
  - the output is the stored set of objects produced by object tracking
- **GUI**: as an outward-facing display of our work, we implemented a JavaScript UI backed by a small server, so that the system's output can be viewed in any browser
  - the output is a clean point-cloud view of the objects and features that the camera has seen
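
Below is a minimal sketch, in Java 17, of the merge step described in the **Object Tracking** bullet above. The types (`Point3`, `TrackedObject`), the method names, and the similarity threshold are illustrative assumptions, not the project's actual API; the real implementation lives in [ObjectSet](./src/main/java/object_detection/ObjectSet.java) and presumably matches features with some tolerance rather than by exact equality.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical 3D feature point; as a record it gets equals()/hashCode() for set membership.
record Point3(double x, double y, double z) {}

// Sketch only: an object is modeled as the set of 3D feature points seen inside its bounding boxes.
class TrackedObject {
    final Set<Point3> features = new HashSet<>();
}

class ObjectSetSketch {
    // Assumed threshold for illustration, not the project's tuned value.
    private static final double MERGE_THRESHOLD = 0.5;

    private final List<TrackedObject> objects = new ArrayList<>();

    // Add a newly detected object, merging it into an existing one when enough features overlap.
    void add(TrackedObject candidate) {
        for (TrackedObject existing : objects) {
            if (similarity(existing, candidate) >= MERGE_THRESHOLD) {
                existing.features.addAll(candidate.features); // same real-world object: keep the union
                return;                                       // the corresponding database update would happen here
            }
        }
        objects.add(candidate); // no sufficient overlap anywhere: treat as a brand-new object
    }

    // Fraction of the candidate's features that already appear in the existing object.
    private static double similarity(TrackedObject existing, TrackedObject candidate) {
        if (candidate.features.isEmpty()) {
            return 0.0;
        }
        long shared = candidate.features.stream()
                .filter(existing.features::contains)
                .count();
        return (double) shared / candidate.features.size();
    }
}
```

Because `features` is a hash set, each membership test is O(1) on average, so comparing a candidate against one stored object is linear in the candidate's feature count.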
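
Similarly, here is a minimal sketch of how a merged object might be persisted with the MongoDB Java driver. The connection string, database and collection names, and document layout are assumptions for illustration, not necessarily the project's actual schema.

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.List;

public class ObjectStoreSketch {
    public static void main(String[] args) {
        // Local MongoDB instance; the URI and names below are illustrative assumptions.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> objects =
                    client.getDatabase("vslam").getCollection("objects");

            // One tracked object stored as a document holding its 3D feature points.
            Document tracked = new Document("label", "object_0")
                    .append("features", List.of(
                            new Document("x", 0.12).append("y", 1.30).append("z", 2.05),
                            new Document("x", 0.15).append("y", 1.28).append("z", 2.11)));
            objects.insertOne(tracked);
        }
    }
}
```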

### Features

> need to fill in this area

# Code

The following links point to the relevant code, data, and tests:
- [The branch and directory containing Java 17 Code to be executed](./../../tree/master/src/main/java)
- [The data needed for the examples used for this system](./../../tree/master/src/main/java/vslam/KeyFrames)
- [The testing code for this system](./../../tree/master/src/test/java/)

# Work Breakdown

**Nafis Abeer**:

**Rohan Kumar**:

**Zane Mroue**:

**Samuel Gulinello**:

**Sanford Edelist**:

# Appendix

### A: Runtime and Space Analysis of ObjectSet

> TODO

### B: References and Material Used

> need to fill in references and also ALL LIBRARIES USED (MATLAB, YOLO, JavaScript stuff, etc.)

#### Personal Access token
Qzkmjrtrda1yGxkxyz8C
src/main/java/top/BackendJava.java

+1 −3
```diff
@@ -3,7 +3,6 @@ package top;
 import object_detection.ObjectDetector;
 import org.springframework.boot.SpringApplication;
 import org.springframework.boot.autoconfigure.SpringBootApplication;
 import org.springframework.context.annotation.Bean;
 import org.springframework.context.annotation.Configuration;
 import org.springframework.stereotype.Controller;
 import org.springframework.web.bind.annotation.*;
@@ -14,7 +13,6 @@ import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;
 import java.io.FileNotFoundException;
 import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
@@ -30,7 +28,7 @@ public class BackendJava {
         }
         @RequestMapping("/")
         public String index(){
-            return "html/index";
+            return "inde";
         }

         @RequestMapping("/runProcess")
```
index.html

+26 −21
```diff
@@ -4,25 +4,30 @@
 <head>
     <meta charset="UTF-8">
     <title>VSlam</title>
-        <link rel="stylesheet" href="../style/main.css">
+    <link rel="stylesheet" href="main.css">
     <!-- Include three.js -->
     <script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/r128/three.min.js"></script>
 </head>

 <body>
-        <div class="container">
+    <p>Visual Simultaneous Localization and Mapping (VSLAM) is the process of taking camera feed, as well as its position, and building a map of the current local world, specifically using visual input. This project uses this process, and builds upon it by also tracking objects within a frame. In this comes two problems: object detection, and then subsequent mapping and tracking of objects within a 3D space. For more information go <a href="https://agile.bu.edu/gitlab/ec504/ec504_projects/group8/-/blob/master/README.md?ref_type=heads">here</a></p>
+    <h3>Press the button below to start the workflow and view the output</h3>
+    <input id="process" type="button" value="Start Workflow" onClick="startWorkflow();"/>
+    <div id="loading"></div>
+
+    <div class="container" id="resultContainer">
+        <div class="image-container" id="imageContainer">
+            <!-- Placeholder for image -->
+        </div>
+        <div class="objects-container" id="objectsContainer">
+            <h4 style="margin: 0;">Select An Object To View Point Cloud With That Object Highlighted</h4>
+            <p style="margin: 0;"><em>point cloud can be manipulated through rotate and zoom</em></p>
+            <!-- Placeholder for Object List-->
+        </div>
+    </div>
-        <input id="process" type="button" value="Start Workflow" onClick="startWorkflow();"/>
-        <input id="drawPC" type="button" value="Show Point Cloud" onclick="drawPointCloud();" />
-        <div id="loading"></div>
-
-        <script src="../js/pointCloud.js"></script>
-        <script src="../js/app.js"></script>
+    <!-- <input id="drawPC" type="button" value="Show Point Cloud" onclick="drawPointCloud();" /> -->
+    <script src="pointCloud.js"></script>
+    <script src="app.js"></script>
 </body>

 </html>
```
app.js

+8 −4
```diff
@@ -14,10 +14,12 @@ async function fetchImageAndObjects() {
 function displayImageAndObjects(imageUrl, objects) {
     const imageContainer = document.getElementById('imageContainer');
     const objectsContainer = document.getElementById('objectsContainer');
+    const container = document.getElementById('resultContainer');

     // Create an image element
     const imageElement = document.createElement('img');
-    imageElement.src = "/Users/sam/Documents/Documents/BU/EC504/FinalProjFrontEnd/keyframe.png";
+    console.log(imageUrl);
+    imageElement.src = 'http://127.0.0.1:5000/' + imageUrl;
     imageElement.alt = 'Image';
     imageContainer.appendChild(imageElement);

@@ -38,6 +40,8 @@ function displayImageAndObjects(imageUrl, objects) {

     // Append objects list to the objects container
     objectsContainer.appendChild(objectsList);
+
+    container.style.visibility = "visible";
 }

@@ -53,8 +57,9 @@ function displayLoading() {
 function hideLoading() {
     loader.classList.remove("display");
 }

 async function startWorkflow(){
+    const startButton = document.getElementById("process");
+    startButton.style.display = "none";
     displayLoading();
     try {
         const response = await fetch('http://127.0.0.1:5555/runProcess');
@@ -67,6 +72,5 @@ async function startWorkflow(){
     } catch (error) {
         console.error(error);
     }
+    await fetchImageAndObjects();
 }

 // fetchImageAndObjects();
\ No newline at end of file
```
main.css

+5 −0
```diff
@@ -14,6 +14,11 @@ body {
   margin: 20px;
 }

+#resultContainer
+{
+  visibility: hidden;
+}
+
 .container {
   display: flex;
   justify-content: space-between;
```