Visualizing Text - David Ferris - 2014

About

The primary purpose of this document is to tell the reader where the different files can be found throughout the project. The easiest way to get the project files is to download the two main project .zip folders, "visualizing_text" and "capstone-visuals", from the "Visualizing Text" and "Web Visuals" links. Downloading the projects this way lets you bypass assembling each project from scratch, which would also require you to add the needed text files to the project directories yourself.

1.) The C++ Application

The C++ project is entitled "visualizing_text". The entire project can be downloaded in .zip form via the "Files" page of the project website. Individual project files can be downloaded from the same page, either in their native form (depending on language) or in .txt form. These files are: Main.cpp, Hash.h, and Word.h.

Main.cpp contains most of the program logic. It is from this file that the program creates a Hash and Words (using the Hash and Word classes), and inserts data into and retrieves data from those data structures. This file also handles the reading and writing of the input and output files, which include: sample_in, sample_shared, and sentences (all .txt).

Hash.h contains the definition of the Hash class. This is the primary data structure used in Main.cpp. The class contains member functions that handle creation and destruction of a Hash object, insertion of objects into the Hash, and retrieval of objects from the Hash. There are also a few smaller helper functions that the larger functions rely on to run properly.

Word.h contains the definition of the Word class. This is the data type that is stored in the Hash. The Word class has member functions that handle construction and destruction, as well as tracking of the sentences in which a Word appears.

2.) The Web Application

The web application, entitled "capstone-visuals", can also be downloaded in .zip form via the "Files" page, or individual files can be downloaded in their PHP, HTML, CSS, and JS forms. These files are: index.php, about.html, style.css, and scripts.js. In order for the CSS styles to work properly, the Bootstrap CSS files must also be located within the project's "css" folder. These files are included in the full project download, but can also be downloaded as a separate .zip folder.

The index.php file serves as the main page of the web app. It uses PHP to read data from the sample_shared.txt file, and then uses this data to fill in HTML tags throughout the page. It provides the HTML for the main page, echoing code to build the layout.

The about.html file is a static HTML page that provides information on use and modification of the web application.

The style.css file is a short stylesheet that, in conjunction with the Bootstrap CSS files, dictates the styling of the about and index pages.

The scripts.js file is a short JavaScript file that contains click handlers for each of the top N words on the index page.

3.) Text files used by the two applications

The sample_in text file contains the text that the user wishes the C++ application to process. This is the raw text from which words and sentences will be pulled. Insert the input text into this file, and make sure that the file is saved in the visualizing_text project folder.

The sample_shared text file is produced by the C++ application; it is written in the WriteResultsToFile function of Main.cpp. The file is called "shared" because it is also read by the web application, and thus is shared by the two projects. Because it is generated by the C++ application, it will be located within that project's folder, but a copy of the file must be placed within the web application's project folder on the server. An FTP client such as WinSCP or Cyberduck can help with this.

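The flow from sample_in to sample_shared described above can be pictured with a rough sketch. Everything here is an assumption made for illustration: the function names CountWords and WriteTopWords are hypothetical, std::unordered_map stands in for the project's own Hash class, and the "word count" line format is a guess, not the actual layout that WriteResultsToFile produces.

```cpp
#include <algorithm>
#include <cassert>   // only needed for in-memory checks of this sketch
#include <cctype>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>   // only needed for in-memory checks of this sketch
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: counts lowercased alphabetic words read from `in`.
// std::unordered_map is a stand-in for the project's Hash class.
std::unordered_map<std::string, int> CountWords(std::istream& in) {
    std::unordered_map<std::string, int> counts;
    std::string word;
    char c;
    while (in.get(c)) {
        if (std::isalpha(static_cast<unsigned char>(c))) {
            word += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        } else if (!word.empty()) {
            ++counts[word];  // a non-letter ends the current word
            word.clear();
        }
    }
    if (!word.empty()) ++counts[word];  // last word if input ends mid-word
    return counts;
}

// Hypothetical stand-in for WriteResultsToFile: writes the `top_n` most
// frequent words, one "word count" pair per line (assumed format).
void WriteTopWords(const std::unordered_map<std::string, int>& counts,
                   std::size_t top_n, std::ostream& out) {
    std::vector<std::pair<std::string, int> > sorted(counts.begin(), counts.end());
    std::sort(sorted.begin(), sorted.end(),
              [](const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;  // alphabetical order breaks ties
              });
    const std::size_t limit = std::min(top_n, sorted.size());
    for (std::size_t i = 0; i < limit; ++i) {
        out << sorted[i].first << ' ' << sorted[i].second << '\n';
    }
}
```

In a real run, the input stream would be a std::ifstream opened on sample_in.txt and the output stream a std::ofstream opened on sample_shared.txt; taking generic streams simply makes the sketch easy to try with in-memory strings first.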
The sentences text file is another text file that is generated by the C++ application. It is used solely by the C++ application, so you do not need to do anything with it. The program uses it to store each of the sentences from the input file, one per line.

The ignored_words file is the final text file used by the project. It is also used solely by the C++ side of the project, where it is only read from, never written to. It contains a list of words that should not be inserted into the Hash because they are not considered useful. There is one word on each line, to make the file easy to read into an array.

Information on program use can be found in the How To file. Information on project details can be found in the Project Definition file. My PowerPoint presentation and notes can be found in the Presentation file.
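As an illustration of how the sentences and ignored_words files might be used, here is a minimal sketch. It is not the code from Main.cpp: the function names LoadIgnoredWords, SplitSentences, and KeptWords are hypothetical, a std::unordered_set replaces the array mentioned above, and ending a sentence at '.', '!' or '?' is a simplifying assumption.

```cpp
#include <cassert>   // only needed for in-memory checks of this sketch
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Reads one ignored word per line, skipping blank lines.
std::unordered_set<std::string> LoadIgnoredWords(std::istream& in) {
    std::unordered_set<std::string> ignored;
    std::string line;
    while (std::getline(in, line)) {
        // Drop a trailing '\r' left behind by Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (!line.empty()) ignored.insert(line);
    }
    return ignored;
}

// Splits text into sentences at '.', '!' or '?' (a simplification), so
// each one can be written on its own line of a sentences file.
std::vector<std::string> SplitSentences(const std::string& text) {
    std::vector<std::string> sentences;
    std::string current;
    for (char c : text) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            if (current.empty()) continue;  // skip whitespace between sentences
            c = ' ';                        // keep each sentence on a single line
        }
        current += c;
        if (c == '.' || c == '!' || c == '?') {
            sentences.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) sentences.push_back(current);  // unterminated tail
    return sentences;
}

// Returns the lowercased words of `sentence` that are not in `ignored`;
// these are the words that would go on to be inserted into the Hash.
std::vector<std::string> KeptWords(const std::string& sentence,
                                   const std::unordered_set<std::string>& ignored) {
    std::vector<std::string> kept;
    std::istringstream tokens(sentence);
    std::string token;
    while (tokens >> token) {
        std::string word;
        for (char c : token) {
            if (std::isalpha(static_cast<unsigned char>(c))) {
                word += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
            }
        }
        if (!word.empty() && ignored.count(word) == 0) kept.push_back(word);
    }
    return kept;
}
```

A set is used here only because membership checks are then a single lookup; reading the same file into an array and searching it linearly, as the description above suggests, would behave the same for a short word list.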