README.md +6 −0

@@ -236,6 +236,8 @@ Removed anchor and query params from urls that are scraped from site to avoid sc
 Added small version of brown corpus for quick use.
+Work on the Intro, background, features, design, analysis, and future improvements sections of the slides
 ### Leon Long
 Worked on Crawler Module: implemented data compression and storage.

@@ -268,6 +270,8 @@ Added the metadata input functionality to all the modules
 Implemented change language feature in GUI, which allows the user to switch between languages
+Work on the Intro, background, features, design and analysis sections of the slides
 ### Tejas Singh
 Worked on base functionality of crawler: implemented Jsoup, basic data structures (such as the URL queue), and CLI (for use with files).

@@ -275,3 +279,5 @@ Worked on real-time feedback: added system for timing methods and computing proc
 Worked on social media integration: added social media option to CLI, created new method for parsing Reddit user pages.
+Worked on the crawler analysis slide.

slides.pptx 0 → 100644 +856 KiB
File added. No diff preview for this file type.