Slide 1: Basic Concepts of Statistical Studies

Slide 2: Introduction
- Decision makers make better decisions when they use all available information in an effective and meaningful way. The primary role of statistics is to provide decision makers with methods for obtaining and analyzing information to help make these decisions. Statistics is used to answer long-range planning questions, such as when and where to locate facilities to handle future sales.

Slide 3: Definition
- Statistics is defined as the science of collecting, organizing, presenting, analyzing, and interpreting numerical data for the purpose of assisting in making more effective decisions.

Slide 4: Applications in Management
- Accounting: Public accounting firms use statistical sampling procedures when conducting audits for their clients.
- Economics: Economists use statistical information in making forecasts about the future of the economy or some aspect of it.

Slide 5: Applications in Management
- Production: A variety of statistical quality control charts are used to monitor the output of a production process.
- Marketing: Electronic point-of-sale scanners at retail checkout counters are used to collect data for a variety of marketing research applications.

Slide 6: Types of Statistics
- There are two types of statistics:
  1. Descriptive statistics is concerned with summary calculations, graphs, charts, and tables.
  2. Inferential statistics is a method used to generalize from a sample to a population. For example, the average income of all families in India (the population) can be estimated from figures obtained from a few hundred families (the sample).

Slide 7: Statistical Population
- A population is a collection of all distinct individuals, objects, or items under study.
  The number of entities in a population, called the population size, is denoted by N.
- A descriptive measure of a population is called a parameter.

Slide 8: Sample
- A sample is a part of a population, and the sample size is denoted by n. A sample should be representative of the population.
- A descriptive measure of a sample is called a statistic.

Slide 9: Data and Data Sets
- Data are the facts and figures collected, summarized, analyzed, and interpreted.
- The data collected in a particular study are referred to as the data set.

Slide 10: Elements, Variables, and Observations
- The elements are the entities on which data are collected.
- A variable is a characteristic of interest for the elements.
- The set of measurements collected for a particular element is called an observation.
- The total number of data values in a complete data set is the number of elements multiplied by the number of variables.

Slide 11: Data, Data Sets, Elements, Variables, and Observations

Company        Stock Exchange   Annual Sales ($M)   Earn/Share ($)
Dataram        NQ                73.10              0.86
EnergySouth    N                 74.00              1.67
Keystone       N                365.70              0.86
LandCare       NQ               111.40              0.33
Psychemedics   N                 17.60              0.13

Slide labels: Element Names (the Company column), Variables (the other three columns), Observation (one row), Data Set (the whole table).

Slide 12: Scales of Measurement
- Scales of measurement include: Nominal, Ordinal, Interval, Ratio.
- The scale determines the amount of information contained in the data.
- The scale indicates the data summarization and statistical analyses that are most appropriate.

Slide 13: Scales of Measurement: Nominal
- Data are labels or names used to identify an attribute of the element.
- A nonnumeric label or numeric code may be used.

Slide 14: Scales of Measurement: Nominal
- Example: Students of a university are classified by the school in which they are enrolled using a nonnumeric label such as Business, Humanities, Education, and so on. Alternatively, a numeric code could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).

Slide 15: Scales of Measurement: Ordinal
- The data have the properties of nominal data, and the order or rank of the data is meaningful.
- A nonnumeric label or numeric code may be used.

Slide 16: Scales of Measurement: Ordinal
- Example: Students of a university are classified by their class standing using a nonnumeric label such as Freshman, Junior, or Senior. Alternatively, a numeric code could be used for the class standing variable (e.g. 1 denotes Freshman, 2 denotes Junior, and so on).

Slide 17: Scales of Measurement: Interval
- The data have the properties of ordinal data, and the interval between observations is expressed in terms of a fixed unit of measure.
- Interval data are always numeric.

Slide 18: Scales of Measurement: Interval
- Example: Shruti has an MAT score of 605, while Raj has an MAT score of 655. Raj scored 50 points more than Shruti.
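
The three scales covered so far differ in which comparisons are meaningful. The sketch below restates the slide examples in Python to make that concrete. It is only an illustration: the variable names, the enrollment of the two students, and the code 3 for Senior are assumptions added here, not taken from the slides.

```python
# Illustrative sketch of the nominal, ordinal, and interval scales.
# Names, the students' enrollment, and the code for Senior are hypothetical.

# Nominal: numeric codes are only labels, so equality is the only
# meaningful comparison.
school_codes = {1: "Business", 2: "Humanities", 3: "Education"}
enrolled = {"Shruti": 1, "Raj": 2}  # hypothetical enrollment
same_school = enrolled["Shruti"] == enrolled["Raj"]

# Ordinal: the codes carry a meaningful order, but the gap between
# two codes is not a measured quantity.
class_standing = {"Freshman": 1, "Junior": 2, "Senior": 3}
junior_outranks_freshman = class_standing["Junior"] > class_standing["Freshman"]

# Interval: differences are expressed in a fixed unit (score points).
mat_scores = {"Shruti": 605, "Raj": 655}
score_gap = mat_scores["Raj"] - mat_scores["Shruti"]

print("Same school:", same_school)
print("Junior ranks above Freshman:", junior_outranks_freshman)
print("Raj scored", score_gap, "points more than Shruti")
```

Subtracting two school codes would run without error, but the result would have no meaning, which is the practical point of distinguishing the scales.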

Slide 19: Scales of Measurement: Ratio
- The data have all the properties of interval data, and the ratio of two values is meaningful.
- Variables such as distance, height, weight, and time use the ratio scale.
- This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.

Slide 20: Scales of Measurement: Ratio
- Example: Raj's college record shows 36 credit hours earned, while Kevin's record shows 72 credit hours earned. Kevin has twice as many credit hours earned as Raj.

Slide 21: Qualitative and Quantitative Data
- Data can be further classified as being qualitative or quantitative.
- The statistical analysis that is appropriate depends on whether the data for the variable are qualitative or quantitative.
- In general, there are more alternatives for statistical analysis when the data are quantitative.

Slide 22: Qualitative Data
- Labels or names used to identify an attribute of each element
- Often referred to as categorical data
- Use either the nominal or ordinal scale of measurement
- Can be either numeric or nonnumeric
- Appropriate statistical analyses are rather limited

Slide 23: Quantitative Data
- Quantitative data indicate how many or how much:
  discrete, if measuring how many
  continuous, if measuring how much
- Quantitative data are always numeric.
- Ordinary arithmetic operations are meaningful for quantitative data.

Slide 24: Scales of Measurement

Data
  Qualitative
    Numerical: Nominal, Ordinal
    Non-numerical: Nominal, Ordinal
  Quantitative
    Numerical: Interval, Ratio

Slide 25: Cross-Sectional Data
- Cross-sectional data are collected at the same or approximately the same point in time.
  Example: data detailing the number of building permits issued in June 2007 in each of the Districts of UP.

Slide 26: Time Series Data
- Time series data are collected over several time periods.
  Example: data detailing the number of building permits issued in Districts of UP in each of the last 36 months.

Slide 27: Data Sources
- Existing Sources
  Within a firm: almost any department
  Business database services: Dow Jones & Co.
  Government agencies: Department of Labor
  Industry associations: Travel Industry Association
  Special-interest organizations: Graduate Management Admission Council
  Internet: more and more firms

Slide 28: Descriptive Statistics
- Descriptive statistics are the tabular, graphical, and numerical methods used to summarize and present data.

Slide 29: Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Slide 30: Example: Hudson Auto Repair
- Sample of Parts Cost ($) for 50 Tune-ups

 91  78  93  57  75  52  99  80  97  62
 71  69  72  89  66  75  79  75  72  76
104  74  62  68  97 105  77  65  80 109
 85  97  88  68  83  68  71  69  67  74
 62  82  98 101  79 105  79  69  62  73

Slide 31: Tabular Summary: Frequency and Percent Frequency

Parts Cost ($)   Parts Frequency   Percent Frequency
50-59                   2                  4    (2/50)100
60-69                  13                 26
70-79                  16                 32
80-89                   7                 14
90-99                   7                 14
100-109                 5                 10
Total                  50                100

Slide 32: Graphical Summary: Histogram
[Histogram titled "Tune-up Parts Cost": horizontal axis Parts Cost ($) with the six classes from 50-59 to 100-109, vertical axis Frequency from 2 to 18.]

Slide 33: Numerical Descriptive Statistics
- The most common numerical descriptive statistic is the average (or mean).
- Hudson's average cost of parts, based on the 50 tune-ups studied, is $79 (found by summing the 50 cost values and then dividing by 50).
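
The tabular and numerical summaries above can be reproduced from the 50 cost values. The following Python sketch shows one way to do it. The helper `class_label` and the variable names are illustrative choices made here; the slides do not prescribe any particular procedure or software for this step.

```python
from collections import Counter

# Parts costs ($) for the 50 tune-ups, copied from the slide.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def class_label(cost, width=10):
    """Return the class a cost falls in, e.g. 57 -> '50-59'."""
    lower = (cost // width) * width
    return f"{lower}-{lower + width - 1}"


n = len(parts_cost)

# Frequency: how many costs fall in each class.
frequency = Counter(class_label(c) for c in parts_cost)

# Percent frequency = (class frequency / n) * 100.
percent = {label: 100 * count / n for label, count in frequency.items()}

# Print the classes in ascending order of their lower limit.
for label in sorted(frequency, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>8} {frequency[label]:>4} {percent[label]:>6.0f}")

# Mean: sum of the 50 cost values divided by 50.
mean_cost = sum(parts_cost) / n
print(f"mean = {mean_cost:.2f}")
```

The exact mean is 78.98, which the slide reports rounded to $79.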

Slide 34: Statistical Inference
- Population: the set of all elements of interest in a particular study
- Sample: a subset of the population
- Statistical inference: the process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population
- Census: collecting data for a population
- Sample survey: collecting data for a sample

Slide 35: Process of Statistical Inference
1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average parts cost of $79 per tune-up.
4. The sample average is used to estimate the population average.

Slide 36: Computers and Statistical Analysis
- Statistical analysis typically involves working with large amounts of data.
- Computer software is typically used to conduct the analysis.
- Instructions are provided in chapter appendices for carrying out many of the statistical procedures using Minitab and Excel.
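
Software also makes it easy to rehearse the four-step inference process on simulated data. The sketch below is a minimal Python illustration, not the Minitab or Excel procedures the slides refer to. The population is an assumption made purely for demonstration: 10,000 simulated parts costs drawn uniformly between $50 and $109. The slides give no population data, so none of these numbers describe Hudson Auto.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated parts costs ($) for 10,000 tune-ups.
# In a real study these values would be unknown.
population = [random.randint(50, 109) for _ in range(10_000)]

# Steps 1-2: treat the population average as unknown and examine
# a sample of 50 tune-ups drawn without replacement.
sample = random.sample(population, 50)

# Step 3: the sample data provide a sample average.
sample_mean = statistics.mean(sample)

# Step 4: the sample average is used as the estimate of the
# population average. The true value is available here only
# because the population was simulated.
population_mean = statistics.mean(population)

print(f"sample mean      = {sample_mean:.2f}")
print(f"population mean  = {population_mean:.2f}")
print(f"estimation error = {sample_mean - population_mean:+.2f}")
```

Changing the seed draws a different sample and gives a different estimate, which is why an estimate from a sample is only an approximation of the population value.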