JIM ROGERS CSCI 460 CAPSTONE
Organizing Big Data into something useful

Illustrating Text by Tag Clouds

Photo credit: Jim Golden

Project Background

Massive amounts of numerical and textual information are produced daily. Survey questions that use “multiple choice” formats can usually be processed and organized electronically with little effort. However, surveys with open-ended questions and comment fields are often difficult to organize and analyze. The large number of surveys, such as the SNC SOOTS, along with opinion polls and online reviews, adds to the complexity. Public opinion and interests can be determined by analyzing the content of various social media, but the volume of data is overwhelming. From www.internetlivestats.com/twitter-statistics: Every second, on average, around 6,000 tweets are tweeted on Twitter, which corresponds to over 350,000 tweets sent per minute, 500 million tweets per day and around 200 billion tweets per year. While the amount of data is overwhelming, the ability to represent “big data” in meaningful, visual ways is increasingly important.

Project Description

Build a software system that collects a large amount of text, processes and stores the data, and then constructs a tag cloud that represents the frequencies of a specified characteristic.
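As a rough illustration of the "processes the data" step, the sketch below counts word frequencies in a block of text and returns the most frequent words as candidate tags. It is only one possible approach, not the project's actual implementation: the function name top_tags, the tiny STOP_WORDS set, and the choice of single words as the "specified characteristic" are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real system would
# likely use a much fuller list or make it configurable.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "it"}


def top_tags(text, n):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    # Lowercase the text and pull out runs of letters, allowing one
    # internal apostrophe so that words like "don't" stay whole.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


print(top_tags("big data makes big clouds of data and big tags", 2))
# prints [('big', 3), ('data', 2)]
```

The counts returned here could then be stored and used later to decide how large each tag should be drawn.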
General Requirements:
  1. Describe the most frequent words in the cloud.
  2. Input the number of tags for the cloud.
  3. Use visual attributes to describe the tags (size, color).
  4. Try to minimize the white space in the cloud.
  5. Consider orientation and other parameters.
  6. Tag clouds do not contain overlapping words.
  7. Design the algorithm so that it handles clouds of different shapes.
  8. Automate where feasible, with user assistance when necessary.
  9. The application should be general enough to easily handle more than one set of data.
  10. Clicking on a visual gives more depth and/or another perspective of the information.
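Two of the requirements above, using size as a visual attribute and avoiding overlapping words, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not a prescribed design: the Box class, the font_size and overlaps functions, the linear scaling, and the 12 to 48 point size range are all hypothetical choices made for illustration. Each tag is treated as an axis-aligned rectangle, which ignores rotated tags.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box of a placed tag (hypothetical helper)."""
    x: float
    y: float
    width: float
    height: float


def font_size(count, min_count, max_count, min_size=12, max_size=48):
    """Map a word count linearly onto a font size range (assumed sizes)."""
    if max_count == min_count:
        # All tags have the same frequency, so use the middle of the range.
        return (min_size + max_size) / 2
    ratio = (count - min_count) / (max_count - min_count)
    return min_size + ratio * (max_size - min_size)


def overlaps(a, b):
    """Return True if two boxes share any interior area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width and
            a.y < b.y + b.height and b.y < a.y + a.height)


# The least frequent tag gets the smallest size, the most frequent the largest.
print(font_size(1, 1, 10))   # prints 12.0
print(font_size(10, 1, 10))  # prints 48.0
# Boxes that only touch along an edge are not counted as overlapping.
print(overlaps(Box(0, 0, 10, 10), Box(10, 0, 5, 5)))  # prints False
```

A placement routine could try candidate positions for each tag and reject any position whose box overlaps a tag that has already been placed.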


Desired outcome: A user interface that produces readable tag clouds of various shapes.
Take the following as examples (click a picture to follow its link).
Picture
Picture
The first two examples are website applications.

The third example is a Google Docs add-on.
Picture