Data Curation/Visualization

ICERT REU

This summer, for my REU internship at TACC, I worked on three different projects. For the first project, I developed a web application for managing the data curation process and making data discoverable throughout its lifecycle. A user uploads data through an online form (to be created using JavaScript/HTML/etc.), which can be accessed from a phone (mobile-friendly) or a computer.
When uploading data through the form, the user can add their own metadata tags or select tags from a list of suggested tags. After filling in the form, the user clicks the submit button. The submitted data is then saved in two ways: first, the form responses (except the attached file/folder) go into an Excel sheet; second, the files/folders are copied to a shared Google Drive. The location of the uploaded files, that is, the Google Drive path at which the file/folder is stored, is appended as a column in the Excel sheet.
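The submit step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual code: a CSV file stands in for the Excel sheet, a local folder stands in for the shared Google Drive, and the function name `save_submission` and the column names in `FIELDS` are hypothetical.

```python
import csv
import shutil
from pathlib import Path

# Hypothetical column names; the real form fields are not listed in the text.
FIELDS = ["title", "tags", "file_location"]


def save_submission(title, tags, upload, shared_dir, sheet_path):
    """Copy the uploaded file to shared storage and log the form answers."""
    shared_dir = Path(shared_dir)
    sheet_path = Path(sheet_path)
    shared_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: copy the attached file into the shared folder
    # (a local stand-in for the shared Google Drive).
    stored = Path(shutil.copy2(upload, shared_dir / Path(upload).name))

    # Step 2: append the form answers plus the stored file's path as one row.
    # The header row is written only when the sheet does not exist yet.
    is_new = not sheet_path.exists()
    with sheet_path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {"title": title, "tags": ";".join(tags), "file_location": str(stored)}
        )
    return stored
```

Keeping the file location as an ordinary column means each row of the sheet points back to the stored file, which is what makes the data discoverable later.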
For the second project, I worked within a group on a project called Enabling Large-Scale Document Analysis on Stampede Through Mobile Phone. The primary goal of the project is to develop a scalable yet easy-to-use solution for analyzing large, complex, and heterogeneous data collections residing in remotely accessible storage locations. An Android application is being developed to run the document analysis on the Stampede supercomputer at TACC. We were able to use the data gathered through the data form for part of this project. I used the R programming language to analyze and visualize the collected data, creating a tree map.
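To illustrate the idea behind a tree map, here is a minimal sketch of the simplest layout, in which each category gets a strip whose area is proportional to its count. The original work was done in R; this Python version, the function name `slice_treemap`, and the example tag counts are all assumptions made for illustration.

```python
def slice_treemap(counts, width=100.0, height=100.0):
    """Return {label: (x, y, w, h)} with strip areas proportional to counts."""
    total = sum(counts.values())
    rects = {}
    x = 0.0
    # Place the largest categories first, left to right.
    for label, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        w = width * count / total  # strip width is this category's share
        rects[label] = (x, 0.0, w, height)
        x += w
    return rects


# Hypothetical tag counts, not data from the project.
layout = slice_treemap({"survey": 3, "images": 1})
```

Plotting libraries typically use more elaborate layouts that keep rectangles closer to square, but the proportional-area principle is the same.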
The third project I worked on was to manage Twitter users' tweet data using an R script for topic modeling.