2. A Few Keywords and Definitions for Understanding Statistics - I

  - by Dr. Prafulla Dikshit (6-8 minute read)

In the last blog post, we talked about the linkages between research, statistics, and data, and got an overview of elementary statistics. In this post, we look at why the study of statistics matters not just to researchers but to anyone interested in improving their daily lives through better-informed decision-making. We'll also go through several keywords and definitions that are typical of the field and crucial for laying a solid foundation of statistical knowledge.

2.1  Data and Statistics

I'm sure all of you have heard the term data. We touched upon it in the previous blog post and described its linkages with research and statistics. Now we want to understand it a lot better. So, what exactly comes to your mind when you think about data? Let me guess: you may be reminded of observations, the information we collect, for example, in a survey, or simply while making a list of students interested in participating in a college event, perhaps along with their preferences for specific activities. It may be scientific data, weather data, or any other type of data. Basically, data are all the numbers and information we collect for a specific purpose, and we may call them observations.

Why is this important for statistics? Because what we do in statistics is analyze this data, interpret it, and use it to make informed decisions. Once we collect the data, we do a lot of things with it (we process it), and everything we do is classified under the term statistics. In other words, statistics is collecting, analyzing, summarizing, and interpreting data, and then drawing meaningful conclusions from it. The last two steps, interpreting and drawing conclusions, are the most critical parts of statistics.

Take one of the examples above and imagine you are one of the students collecting data on those interested in the college event. If you simply collect the data in a nice format and take it to your professor or mentor, he will probably say, "What does it even mean? I don't care what you have collected." He will not be interested because the data is of little use until you have drawn some insights or conclusions from it, such as what percentage of students are interested in a singing activity, a debate, or a poster presentation. Moreover, he might want you to comment on the most popular activities, or on the popularity of certain events by gender.

These additional expectations the professor has of your little "data-collection" activity represent the last two steps in statistics, namely interpreting and drawing conclusions. These two steps form the core of the framework for studying statistics. When you add them to your data collection activity, it qualifies as research, precisely because we are then able to make meaningful decisions based on the findings extracted from the data.
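To make the professor's questions concrete, here is a minimal Python sketch of the interpreting step: it turns a raw list of responses into percentages per activity, overall and by gender. The eight responses, the activity labels, and the helper name percentage_by_activity are all invented for illustration; they do not come from a real survey.

```python
from collections import Counter

# Hypothetical survey responses: (gender, preferred activity).
responses = [
    ("F", "singing"),
    ("M", "debate"),
    ("F", "poster"),
    ("M", "singing"),
    ("F", "singing"),
    ("M", "debate"),
    ("F", "debate"),
    ("M", "singing"),
]

def percentage_by_activity(records):
    """Return {activity: percentage of records that prefer it}."""
    counts = Counter(activity for _, activity in records)
    total = len(records)
    return {activity: 100 * n / total for activity, n in counts.items()}

# Share of all students preferring each activity.
overall = percentage_by_activity(responses)

# The same breakdown, computed separately for each gender.
by_gender = {
    g: percentage_by_activity([r for r in responses if r[0] == g])
    for g in ("F", "M")
}

# The activity with the largest overall share.
most_popular = max(overall, key=overall.get)

print(overall)
print(by_gender)
print(most_popular)
```

The raw list is only data collection. The percentages and the most popular activity are the kind of conclusions the professor is actually asking for.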

2.2  Population

The next thing we want to talk about is population. Population is a word we hear very frequently in our daily lives. For example, the population of China, the most populous country in the world, is about 1.42 billion. That roughly captures what we mean by population in statistics. A population is the complete set of elements in a group or entity that we are looking at for study or research purposes.

There can be different types of populations based on what we are looking at. For example, we can have a population of students in a class, which includes all the students in the class; a population of teachers within a school, which includes all the teachers in the school; the population of a school, which includes everyone who works in the school; or a population of doctors in a given state, which includes all the doctors in the state; and so on. Thus, a population is simply the complete set of elements being studied.

However, do you think it is always feasible to study each element of a population? Just imagine that the government of India, a country with a population of about 1.38 billion, wants to know how its citizens perceive a new health care law and plans to ask every citizen about it! That would be almost unthinkable. Let's come down a few degrees and consider a school administration that wants to know its students' perception of their teachers. Even if it set out to ask each of its 5000 students, that would still be a lot of students to ask. Or think of the surveys that predict election results: do you think they can approach everybody in the state to ask who is going to win? Would it not be better if the government, the school administration, or the survey agency in the above examples chose a smaller subset of the population and then went ahead with the study?

This smaller subset is called a sample, and that is precisely the next keyword we want to know about.

2.3 Sample

The time, money, and resources required to study an entire population are usually a big constraint and force us to study a smaller group, or subset, of the population. That subset is called a sample, and a sample is generally much smaller than the population. However, there are circumstances in which studying the entire population is a necessity. The most popular example of a study that has to cover the entire population is a census, as discussed below.
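The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the 5000 student ID numbers and the sample size of 200 are assumed values, and the standard-library function random.sample is used to draw the subset.

```python
import random

random.seed(42)  # fixed seed so the sketch gives the same sample on every run

# Hypothetical population: ID numbers for 5000 students.
population = list(range(1, 5001))

# Draw 200 distinct students; every student is equally likely to be picked.
sample = random.sample(population, k=200)

# Fraction of the population that ends up in the sample.
fraction = len(sample) / len(population)

print(len(population), len(sample), fraction)
```

Here the sample covers only 4% of the population, which is what makes it so much cheaper to study than the full set.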

2.4 Census

A census is the most prominent example of a study in which the entire population has to be covered. Government officials conducting the census approach every household in the country and take stock of each member of that household. Ever wondered why the census typically takes place every 10 years and not every year? You guessed it right: the huge costs involved are a big part of the reason.

Another situation in which we want to study the entire population is when the population is small and well defined. For example, a class teacher may be interested in knowing the seat preferences of the students in his class. In this case, all the students in the class constitute the population, and it is both possible and necessary to study the entire class.

2.5 Representative Sampling

Now that we have understood the importance of studying a smaller subset of a population rather than the entire population, let's talk about what a good sample looks like. Imagine you are part of a college survey aimed at finding out whether students would prefer to wear a uniform to college. What if, while collecting data for the survey, you approach only your friends and acquaintances, or a close-knit group that regularly gets together in the college canteen and has views very similar to yours on the issue? The size of the group might be almost exactly equal to the number of people you want to survey. It would be very convenient for you to approach them, but would they represent the views of the entire population of students in the college? The answer is no. The views of such a conveniently chosen sample would not represent the views of the entire student population, and would bias both our conclusions and the college administration's view of the issue. To avoid such biases, we should ideally collect a sample as randomly as possible, without giving a thought to our own preferences or convenience. Collecting such an unbiased sample is called representative sampling, and a representative sample should be considered the ideal sample.
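The uniform survey can be simulated to show how a convenience sample misleads. The sketch below is a toy model under stated assumptions: a hypothetical college of 1000 students in which exactly 300 favor a uniform, a friend group of 50 who all share the same view, and a helper named percent_yes that is invented here for illustration.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical college of 1000 students: 300 favor a uniform, 700 do not.
views = ["yes"] * 300 + ["no"] * 700

def percent_yes(sample):
    """Percentage of a sample answering 'yes'."""
    return 100 * sample.count("yes") / len(sample)

# Convenience sample: 50 like-minded friends who all happen to favor a uniform.
convenience_sample = views[:50]

# Random sample: 50 students drawn without replacement, each equally likely.
random_sample = random.sample(views, k=50)

true_share = percent_yes(views)
print("whole college:", true_share)
print("convenience sample:", percent_yes(convenience_sample))
print("random sample:", percent_yes(random_sample))
```

The convenience figure is 100% by construction, far from the true 30%. The random figure will usually land much closer to 30%, although any single random sample can still miss by several percentage points.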

To be continued in Part II.
