Assignment 7: Big Data Paper 2 Pre-work

This is the research for the second report. You need to turn in:

  • Initial data analysis (in MS Word format)
  • An outline
  • A bibliography
  • Supporting Excel files

You can turn in the analysis, outline, and bibliography as one document.

The goal for the paper is to take some of the big data analysis techniques we learned and apply them to learn something new.

Do not talk about what you could do. Actually do it. This has been the most frequent mistake in past years. Don’t say “From this, we could determine which players we need to cut from our team.” Instead say, “Because of x, y, and z, we should cut Bob, Fred, and George.”

The final paper should have the following parts:

  • Introduction and thesis
    • Because of the nature of this report, the main topic will be the processing of the data and its analysis. You may have one analysis on football and another analysis on smoking in this report. Or you could do both on football. Your goal is to show the reader what you can do with the data. Write your introduction accordingly.
    • Clearly label the thesis.
  • Text Processing
    • Search for something in the text files.
    • Create something new. Don’t just re-create what we did with the name processing. At the very least, compare name trends in different states.
    • Describe what you are doing for processing, and how you are doing it.
    • Talk very specifically about the results.
    • If you want, you can feed this data into the graph and/or pivot table.
    • Do not do the exact same thing we did in class. Process different text files than the “name” text files. (Comparing names between states is fine, because that would be new.)
  • Graph
    • Graph the data.

    • Label your x and y axes. Give the graph a title.

    • Describe in detail what this graph is showing. Don’t just graph some random thing. Graph data that is informative.

    • Then describe why it matters, and what exactly you can derive from that data.

    • Not all data sets I’ve listed are good for pivot tables. For pivot tables you need data that has multiple categories on each line. Out of the Data Sets, I’d recommend looking at:

      • California ACT/SAT data
      • Minnesota Payroll Data
  • Pivot Table
    • Use a pivot table.
    • Make one or more pivot tables that inform the reader.
    • Specify how you created the table and where you got the data.
    • Tell why the data you are showing matters and what you can do with it.
    • Create a pivot table that shows what a pivot table can do. Don’t just re-create the original table. Part of the purpose here is to show the reader the power of pivot tables.
  • Conclusion

  • Bibliography

  • Also - Upload your supporting Excel files.

  • The paper should have 1,000 words or more, so aim for 20-25 points in your outline.
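
For the Text Processing part, comparing name trends between states might look something like the sketch below. It is only an illustration, not the method we used in class. It assumes each line of a state file looks like state,sex,year,name,count, and the sample records are made up, so check the layout of the real files before reusing it. The function names name_shares and compare_states are hypothetical.

```python
import csv
from collections import defaultdict
from io import StringIO

# Made-up sample records, NOT real data.
# Assumed line layout: state,sex,year,name,count
SAMPLE = """\
CA,F,2010,Emma,30
CA,F,2010,Sophia,50
CA,M,2010,Jacob,40
TX,F,2010,Emma,45
TX,F,2010,Sophia,20
TX,M,2010,Jacob,35
"""


def name_shares(lines):
    """Return {state: {name: share of that state's total count}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for state, _sex, _year, name, count in csv.reader(lines):
        counts[state][name] += int(count)
    shares = {}
    for state, names in counts.items():
        total = sum(names.values())
        shares[state] = {name: n / total for name, n in names.items()}
    return shares


def compare_states(shares, state_a, state_b):
    """Return (name, share_a, share_b) rows, largest gap between states first."""
    names = set(shares[state_a]) | set(shares[state_b])
    rows = [
        (name, shares[state_a].get(name, 0.0), shares[state_b].get(name, 0.0))
        for name in names
    ]
    return sorted(rows, key=lambda row: abs(row[1] - row[2]), reverse=True)


shares = name_shares(StringIO(SAMPLE))
comparison = compare_states(shares, "CA", "TX")
for name, share_a, share_b in comparison:
    print(f"{name:8s} CA {share_a:6.1%}  TX {share_b:6.1%}")
```

Shares are used instead of raw counts so that a large state and a small state can be compared fairly. To run this on real files, pass an open file object to name_shares in place of StringIO(SAMPLE). The resulting rows can be pasted into Excel to feed the graph or pivot table.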

However, this part of the assignment is just the pre-work. For this assignment I am looking for:

  • Is the data analysis included in the outline? Make sure the data gets copied from your original sources into the MS Word document. I’ll only look at the Word document, so if it only exists in an Excel document you won’t get credit for it.
  • Do you have a first version of your text processing?
  • Do you have a first version of your pivot table?
  • Do you have a first version of your graph?
  • Do you have an outline showing what you will talk about?
  • Do you show in the outline where you will cite items?
  • Do you have at least one additional source that you’ve pulled into your outline discussion? (Not a raw data source.)
  • Do you have a bibliography that cites the data you used and any other background information?
  • Do you also have the supporting Excel files you are turning in?
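
While building the first version of your pivot table, it may help to see what a pivot table actually computes. The sketch below is a rough illustration in plain Python, not a replacement for the Excel pivot table the assignment asks for. The rows are made up, and the column names (agency, status, wages) and the function pivot_mean are hypothetical. It groups rows by two categories and averages a value for each combination, which is what Excel does when a value field is set to Average.

```python
from collections import defaultdict

# Made-up rows, NOT real payroll data. Column names are hypothetical.
ROWS = [
    {"agency": "Parks", "status": "Full-time", "wages": 52000},
    {"agency": "Parks", "status": "Part-time", "wages": 18000},
    {"agency": "Parks", "status": "Full-time", "wages": 48000},
    {"agency": "Transit", "status": "Full-time", "wages": 61000},
    {"agency": "Transit", "status": "Part-time", "wages": 22000},
]


def pivot_mean(rows, row_key, col_key, value_key):
    """Average value_key for every (row_key, col_key) combination."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[row_key], row[col_key])].append(row[value_key])
    return {pair: sum(values) / len(values) for pair, values in cells.items()}


table = pivot_mean(ROWS, "agency", "status", "wages")
row_labels = sorted({r for r, _ in table})
col_labels = sorted({c for _, c in table})

# Print the table: one line per agency, one column per status.
print(f"{'':10s}" + "".join(f"{c:>12s}" for c in col_labels))
for r in row_labels:
    line = "".join(f"{table.get((r, c), 0):>12,.0f}" for c in col_labels)
    print(f"{r:10s}{line}")
```

Notice that the output has fewer rows than the input and answers a question the raw rows do not answer directly. That is the difference between a useful pivot table and a re-creation of the original table.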