6.085 Problem Set Two

Regression

The CSV file contains various biological data from several species of bats (source: Mating System and brain size in bats, Pitnick et al.). The BodyMass, BrainMass, and TestesMass columns are all in grams. NeoCortexVolume is in mm^3. FemalePromiscuity indicates whether female bats mate with multiple males during their lifetime. Finally, Diet=1 indicates a fruit-based diet; Diet=2 indicates an "other" diet.

Complete the following analysis:

1. Biologists usually use log units for regression analysis. What is the underlying assumption when using log units instead of standard units? For the rest of the assignment, we'll trust the biologists' model intuition and use log units.

2. Do an exploratory data analysis of BodyMass, TestesMass, BrainMass, and NeoCortexVolume. Describe the correlations that you see and provide a possible explanation for them. Neocortical volume is often used as an indicator of higher cognitive function. How well correlated are BrainMass and NeoCortexVolume?

3. Compute a linear regression fit to predict TestesMass based on BrainMass alone, BodyMass alone, and BrainMass and BodyMass together (all in log units). Summarize your findings.

4. Compute linear regression fits to predict TestesMass and BrainMass, each based on BodyMass (all in log units). Plot the residuals against FemalePromiscuity, MatingSystem, and Diet. Summarize your findings. Do these help explain the results of Problem 3?

5. In the previous problem set, we considered whether we could make claims about individuals given aggregate data. Here, we are using measurements of an individual to make claims about populations of bats. Is this reasonable? Why or why not?
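
The mechanics of the regressions in Problems 3 and 4 can be sketched as follows. This is only one possible approach, not a prescribed solution: it uses plain NumPy least squares, the helper name `fit_log_linear` is hypothetical, and the data are synthetic stand-ins generated in the script (not values from the bat CSV). The variable names merely echo the column names above.

```python
import numpy as np


def fit_log_linear(response, *predictors):
    """Ordinary least squares of log(response) on the logs of the predictors.

    Returns (coef, residuals): coef[0] is the intercept, coef[1:] are the
    slopes in the order the predictors were passed.
    """
    log_y = np.log(response)
    # Design matrix: a column of ones (intercept) plus one log column per predictor.
    X = np.column_stack([np.ones(len(log_y))] + [np.log(p) for p in predictors])
    coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    residuals = log_y - X @ coef
    return coef, residuals


# Synthetic stand-in data (NOT the bat measurements): a power law with
# multiplicative noise, which becomes a straight line in log units.
rng = np.random.default_rng(0)
n = 60
body_mass = rng.uniform(5.0, 500.0, size=n)
brain_mass = 0.05 * body_mass**0.7 * np.exp(rng.normal(0.0, 0.1, size=n))
group = rng.integers(0, 2, size=n)  # hypothetical binary label, e.g. a yes/no column

coef, resid = fit_log_linear(brain_mass, body_mass)
print("intercept, slope:", coef)

# Compare residuals across the categorical label, as Problem 4 asks to do by plot.
for g in (0, 1):
    print("group", g, "mean residual:", resid[group == g].mean())
```

Passing two predictors, for example `fit_log_linear(testes_mass, brain_mass, body_mass)` with a hypothetical `testes_mass` array, gives the joint fit. Because the synthetic label is independent of the noise, the group means here carry no signal; with real data, a systematic difference between groups would suggest the categorical variable explains variation that the mass predictors do not.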