Fundamentals of data analysis

Published on May 29, 2017   25 min
Hello. My name is Dr Brian Blank, and I am a Finance faculty member at Mississippi State University. Today, I will be discussing "The Fundamentals of Data Analysis". Specifically, we will focus on how to use large datasets once the data has been gathered and cleaned.
Lately, one of the topics frequently covered by the media and researchers alike is the use of big data. Because data are becoming increasingly prevalent, being able to analyze large datasets is more important than ever. A dataset is simply a collection of observations, or individual pieces of information, coded together in a uniform manner, and using a large dataset allows us to learn more about reality. Thousands of observations can be used to understand what is taking place in the outside world. This is possible because of the large quantities of computational power that are now available with modern technology. As datasets become larger and more common, they will be used more frequently, and data analysis will be more informative than it has been before. With more data, we can learn more, and this will add value for professionals in various industries.
Before we begin, I want to talk about some of the ideas that we will be discussing today. We will first discuss ways to summarize our data. Summary statistics, which are also sometimes referred to as descriptive statistics, can be used to help us understand the data that we will be analyzing. These statistics describe characteristics of the data and give us an idea of which observations are or are not included in our dataset. Next, we will spend time analyzing data and summarizing the relation between variables. In particular, univariate analysis examines how a single explanatory variable relates to our variable of interest. We will therefore begin by reviewing how two variables are correlated, which tells us how changes in one relate to changes in the other. Then we will incorporate additional variables and perform multivariate analysis. Multivariate analysis allows us to incorporate information that could be related to our variable of interest, and it provides additional confidence in our results by controlling for the effects of those other variables. For the purposes of our discussion today, we will focus on financial information, that is, information about a firm that comes from financial statements. However, most of the principles are also easily applicable to other forms of data. An example of multivariate analysis using financial information could be a situation where we want to know if it is more profitable for a firm to hold higher levels of debt in order to finance the firm's assets. In that case, we may want to consider the size of the firm even if it is not our main variable of interest. Size is important because large and small firms may have other differences between them that could also influence the firm's profitability and debt holding. We will discuss these ideas in more detail later in our analysis.
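To make the progression from summary statistics to correlation to multivariate analysis more concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the six firm observations are invented and do not come from any dataset used in this talk, and the variable names (leverage, size, roa) and helper functions are hypothetical. The numbers were deliberately constructed so that controlling for size changes the picture.

```python
import statistics

# Hypothetical firm-level data, invented purely for illustration.
leverage = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]   # debt / total assets
size = [4.0, 6.0, 5.0, 8.0, 7.0, 9.0]             # log of total assets
roa = [0.045, 0.060, 0.045, 0.070, 0.055, 0.070]  # return on assets


def summarize(values):
    """Return basic descriptive statistics for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def cross_dev(x, y):
    """Sum of products of deviations from the means of x and y."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))


def pearson(x, y):
    """Pearson correlation between two variables."""
    return cross_dev(x, y) / (cross_dev(x, x) * cross_dev(y, y)) ** 0.5


def ols_two_regressors(y, x1, x2):
    """OLS of y on an intercept, x1 and x2 (closed-form normal equations)."""
    s11, s22, s12 = cross_dev(x1, x1), cross_dev(x2, x2), cross_dev(x1, x2)
    s1y, s2y = cross_dev(x1, y), cross_dev(x2, y)
    det = s11 * s22 - s12 ** 2  # zero if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = statistics.mean(y) - b1 * statistics.mean(x1) - b2 * statistics.mean(x2)
    return b0, b1, b2


# Step 1: summary (descriptive) statistics for each variable.
summary = {name: summarize(v) for name, v in
           [("leverage", leverage), ("size", size), ("roa", roa)]}

# Step 2: how profitability and leverage move together on their own.
r_roa_leverage = pearson(roa, leverage)

# Step 3: multivariate analysis, controlling for firm size.
intercept, b_leverage, b_size = ols_two_regressors(roa, leverage, size)

print(summary["roa"])
print("corr(roa, leverage):", round(r_roa_leverage, 3))
print("leverage coefficient, controlling for size:", round(b_leverage, 3))
```

In this made-up sample, the simple correlation between profitability and leverage is positive, yet the leverage coefficient turns negative once size is held constant, because the larger firms here both carry more debt and earn higher returns. That is the kind of difference a control variable is meant to reveal. In practice, one would typically rely on a statistics package rather than hand-coded formulas like these.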