Computer and Information Science

Posted: August 26th, 2021

HIGHER COLLEGES OF TECHNOLOGY

Computer and Information Science

Non-Exam Based Assessment

Cover Sheet

Course Name Statistics and Probability Course Code CIS 2003
Assessment Statistics Project Handing out Week Week 9
To be submitted in: Week 14
Maximum Marks 85 Marks Percentage of Final Grade. 25%

This assessment will assess the following Course learning outcomes:

  CLO1 CLO2 CLO3 CLO4 CLO5  
Question No.  
The entire project/case study/poster is designed and developed by me (and my team members). Proper citation has been used wherever I (and my team members) used other sources. No part of this project has been designed, developed, or written for me (and my team members) by a third party. I have a copy of this project in case the submitted copy is lost or damaged. None of the music/graphics/animation/video/images used in this project have violated the copyright/patent/intellectual property rights of an individual, company, or institution.

Student Signature:                                                                Date:

Student Signature:                                                                Date:
 

For Examiner’s Use Only

Question No. A Project Report B 50% Weighted marks of Project Report C Oral/viva D 50% weighted scores of Oral/viva Total Marks=B +D
Marks Allocated 70 50% 15 50%  
Student 1          
Student 2          
Student 3          

Table of Contents

Non-Exam Based Assessment

Cover Sheet

This assessment will assess the following Course learning outcomes

Part 1

Introduction

c. Summarize and present the Population using Descriptive Statistics (Mean, Median, Mode, Variance and Standard Deviation)

d. Present Population data graphically using box plot and histogram

f. Take a Sample of 15 from the above Population and use Descriptive Statistics (Mean, Median, Mode, Variance, and Standard Deviation) to summarize the sample

g. Explain in a sentence or 2 which method of sampling was used by you for selecting the above sample. Explain this method clearly

Part 2

b. Using the histogram constructed in Section I(d), calculate Pearson’s coefficient of skewness and check for outliers

c. Make a final comment

Part 3

B. Population mean at a Confidence level

C. Repeat (Part II a) for samples of sizes 31 and 45

D. Widths of the above confidence level

e. Check if all of your confidence intervals contain the μ

Part 4

A. Research problem

B. Find the null and alternative hypotheses (H0 and H1)

Use a z test to make a conclusion

Project Objectives:

  1. To summarize data using descriptive statistics.
  2. To check for normality of data
  3. To estimate the population parameters from sample statistics using Confidence Levels for the normal data set.
  4. To conduct hypothesis testing about the Mean.

Project Description:

  • The practical part of the project is described in sections I, II, III, and IV.
  • MS-Excel, Megastat Add-in program can be used to complete the tasks.
  • An appropriately formatted MS-Word report describing all the project tasks should be submitted at the end of the project.
  • This will be followed by an individual Viva/Oral defense.

Section I: Descriptive Statistics - Population Parameters and Sample Statistics (23 Marks)

  1. A Short Introduction.                                                                                                                           2 Marks

Descriptive statistics are brief coefficients that summarize a data set, which may be a sample or the entire population. They fall into two broad categories: measures of central tendency and measures of spread (variability). The measures of central tendency indicate the central location of a data set and include the mean, median, and mode. The measures of spread describe how widely the values are dispersed and include the range (from the minimum and maximum), variance, and standard deviation; skewness and kurtosis additionally describe the shape of the distribution. This paper is divided into four parts. Part one presents the descriptive statistics, covering both central tendency and spread, and represents the data set graphically with a histogram built in Excel. Part two tests the normality of the data using the histogram, Pearson's coefficient of skewness, and a check for outliers. Part three estimates the population mean with confidence intervals. Finally, part four carries out a hypothesis test about the mean and offers a summary.

  • Select A Data Set (No Two Groups Can Choose The Same Dataset)                                 2 Marks

Dubai Statistics Center https://www.dsc.gov.ae/en-us

The selected data set shows the iron levels (Iron, 60 – 160 µg/dL) in the bodies of 60 people.

  • Summarize and present the Population using Descriptive Statistics (Mean, Median, Mode, Variance, and Standard Deviation).

Table 1: Descriptive Statistics of the Population (Mean, Median, Mode, Variance and Standard Deviation)

Mean 136.516
Median 134.9
Mode #N/A
Variance 3258.946781
Standard Deviation 57.08718578
Maximum 219.36
Minimum 42.61
Range 176.75
Skewness -0.147151978
Kurtosis -1.335204003
Count 60
  • Present Population data graphically using box plot and histogram. 3 Marks

Figure 1: Histogram on Iron Levels

Figure 2: Histogram on Iron levels

Table 2: Descriptive Statistics for Iron, 60 – 160 µg/dL

Name Iron, 60 – 160 µg/dL
Fatma Ahmad Ali Khalfan 188.96
Fatma Ahmad Ramadhan 214.03
Fatma Nasser Hussain 190.57
Fatma Salem Ahmad 187.13
Fuad Ahmed Mohamed 45.51
Hajar Humaid Salem 135.54
Hana Hussain Essa 87.01
Hanadi Mubarak 156.5
Hanifa Abdulfatah Hassan 140.31
Hassan Al Jaidi 114.73
Hend Abdulla Mohammad 212.09
Hessa Thani Mubarak 57.34
Iman Hussain Abdulla 71.68
Juma Sayed Abdulla 108.97
Laila Ahmad Mohamed 173.15
Descriptive Statistics
Mean 138.9013
Median 140.31
Mode #N/A
Variance 2937.609
Standard Deviation 54.19972
Maximum 214.03
Minimum 45.51
Range 168.52
Skewness -0.28206
Kurtosis -1.21811
Count 15
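
The figures in Table 2 can be reproduced outside Excel. The following is a minimal sketch using only Python's standard `statistics` module; the function name `describe` and the dictionary keys are illustrative choices, not part of the project. The variance and standard deviation in Table 2 match the formulas that divide by n (`pvariance` and `pstdev`), so those are the ones used here.

```python
import statistics

# Iron levels (µg/dL) of the 15 sampled people, copied from Table 2.
sample = [188.96, 214.03, 190.57, 187.13, 45.51, 135.54, 87.01, 156.5,
          140.31, 114.73, 212.09, 57.34, 71.68, 108.97, 173.15]


def describe(values):
    """Return the summary measures listed in Table 2 (hypothetical helper)."""
    modes = statistics.multimode(values)
    # If every distinct value is equally frequent there is no mode (Excel shows #N/A).
    mode = None if len(modes) == len(set(values)) else modes
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": mode,
        "variance_n": statistics.pvariance(values),   # divides by n
        "std_dev_n": statistics.pstdev(values),       # square root of the above
        "maximum": max(values),
        "minimum": min(values),
        "range": max(values) - min(values),
        "count": len(values),
    }


summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

If the sample formulas that divide by n - 1 are wanted instead, `statistics.variance` and `statistics.stdev` can be substituted; for these 15 values the variance then rises to about 3147.4.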
  • Compare And Comment on the Shape of the Histogram Obtained For The Above Data With General Shapes of Histograms. (Note: You May Refer Online To Find Out And Compare The Various Shapes of A Histogram.) 3 Marks

The histogram in Figure 2 has a bimodal shape. Instead of a single central peak with bars falling away gradually on the left and the right, as in a bell-shaped histogram, the bars form two separate peaks. The negative kurtosis (-1.335) is consistent with this flat, two-peaked shape. A bimodal shape often indicates that the data come from two different sources. If so, the two sources should be separated and analyzed individually before statistical conclusions are drawn about the population.

  • Take A Sample of 15 From The Above Population and Use Descriptive Statistics (Mean, Median, Mode, Variance, And Standard Deviation) To Summarize The Sample. 4 Marks
  • State Which Method of Sampling Was Used By You for Selecting the Above Sample

Explain clearly how you carried out sampling using this method. 5 Marks

The sampling technique applied to the Population is simple random sampling. Under this technique, every observation in the data set has an equal chance of being selected. The sample is meant to give a fair representation of the entire population; any difference between a sample statistic and the corresponding population parameter is called sampling error. The method was chosen because it is easy, straightforward, and unbiased, since it gives every record in the population the same opportunity to be sampled. Each iron-level record therefore had an equal chance of selection, and the sample was drawn at random from the overall Population.
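
A minimal sketch of simple random sampling without replacement, using Python's standard `random` module. The row numbers 1 to 60 are a hypothetical stand-in for the 60 population records, and the seed value is arbitrary; it is fixed only so that the draw can be repeated.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct items; every item has the same chance of selection."""
    rng = random.Random(seed)          # separate generator so the draw is repeatable
    return rng.sample(population, k)   # sampling without replacement


# Hypothetical stand-in for the population: row numbers 1..60 of the data sheet.
population_rows = list(range(1, 61))

chosen_rows = simple_random_sample(population_rows, 15, seed=2021)
print(sorted(chosen_rows))
```

The selected row numbers would then be looked up in the data sheet to obtain the 15 iron-level values.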

PART II: Checking the Normality of Population (6 Marks)

a. Read Pages 322-324 Of The Textbook (Provided In Bbl). 0 Mark

b. Using the Histogram Constructed in Section I(d), Calculate Pearson’s Coefficient of Skewness and Check for Outliers. 5 Marks

The histogram in Section 1d is for the entire Population.

Table 3: Descriptive statistics and skew-ness of the Population as a whole

Mean 136.516
Median (Q2) 134.9
Mode #N/A
Variance 3258.946781
Standard Deviation 57.08718578
Maximum 219.36
Minimum 42.61
Range 176.75
Skewness -0.147151978
Kurtosis -1.335204003
Count 60
1st Quartile(Q1) 85.9175
2nd Quartile (Q2) 134.9
3rd Quartile (Q3) 189.3625
IQR 103.445

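Pearson’s coefficient of skewness = 3(Mean - Median)/Standard Deviation = 3(136.516 - 134.9)/57.08718578 = 0.084922736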

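Outliers are identified from the first and third quartiles (Table 4) and the IQR of 103.445. A value is an outlier if it lies below Q1 - 1.5 × IQR = 85.9175 - 155.1675 = -69.25 or above Q3 + 1.5 × IQR = 189.3625 + 155.1675 = 344.53. The minimum (42.61) and the maximum (219.36) both lie inside these limits, so the population contains no outliers.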

Table 4: 1st and 3rd quartile

1st Quartile(Q1) 85.9175
3rd Quartile (Q3) 189.3625
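
The two normality checks can be sketched in a few lines of Python. This is an illustration only, assuming the 1.5 × IQR rule for outliers; the function names are hypothetical and the inputs are the population values from Table 3.

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson's coefficient of skewness: 3(mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev


def outlier_fences(q1, q3):
    """Lower and upper limits under the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Population values taken from Table 3.
pc = pearson_skewness(mean=136.516, median=134.9, std_dev=57.08718578)
lower, upper = outlier_fences(q1=85.9175, q3=189.3625)

# The minimum and maximum are the only values that need checking.
has_outliers = 42.61 < lower or 219.36 > upper

print(f"Pearson coefficient: {pc:.6f}")
print(f"Fences: {lower:.2f} to {upper:.2f}; outliers present: {has_outliers}")
```

A coefficient at or beyond +1 or -1 is usually read as significant skew, so a value this close to zero points to an approximately symmetric distribution.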

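d. Make a final comment (using the values/findings from the above step) about the normality of the Population. 1 Mark

Pearson’s coefficient is 0.084922736, which lies well inside the interval from -1 to +1, so the Population is not significantly skewed; the small positive value indicates only a very slight pull to the right. No value lies outside the 1.5 × IQR limits, so there are no outliers. On these two checks the Population is approximately symmetric, although the bimodal histogram means it is not perfectly bell-shaped.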

PART III: Estimation of Population Mean (20 Marks)

  1. For The Random Sample Of Size 15 Taken Above In PART 1, Make Both The Point And Interval Estimates For The Population Mean At A Confidence Level Of 95%. 5 Marks

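Sample size = 15

Standard deviation = 54.19972

Sample mean = 138.9013

Confidence level = 95%

Critical value z = 1.96

Point estimate of µ = 138.9013

Margin of error = 1.96 × 54.19972/√15 = 1.96 × 13.9943 = 27.4288

Interval estimate = 138.9013 ± 27.4288, that is, 111.4725 < µ < 166.3301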
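
A 95% z interval for the mean uses the critical value z ≈ 1.96 and the standard error s/√n. The sketch below shows that calculation for the sample of 15, using `statistics.NormalDist` from the Python standard library; the function name `z_confidence_interval` is a hypothetical helper. With a sample this small and a standard deviation taken from the sample, many textbooks would use Student's t instead, which gives a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, std_dev, n, confidence=0.95):
    """Return (lower, upper) limits of a z-based confidence interval for the mean."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_dev / sqrt(n)
    return mean - margin, mean + margin


# Summary values for the sample of 15 (from Table 2).
low, high = z_confidence_interval(mean=138.9013, std_dev=54.19972, n=15)
print(f"{low:.2f} < mu < {high:.2f}")
```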

  • Repeat (Part II a) for samples of sizes 31 and 45. 10 Marks

Table 5: Sample size 31

Sample size 31
Name Iron, 60 – 160 µg/dL
Aaeda Abdulaziz E 177.09
Abdulla Ali Mohamed 123.38
Abdulla Juma Jaffar Hussain 182.93
Abdullah Khamis Mohamed 107.34
Abdulrasool Saleh 195.72
Abeer Rashed Saeed 205.69
Afra Belal Ismail 204.81
Ahlam Abdulla Mohamed 217
Ahmad Abdulla Ahmad 142.29
Ali Ahmed Sulaiman 114.38
Ali Mohamed Ali 112.46
Ali Mohammad Bin Tamim 146.68
Amal Hassan Abdulla 44.74
Amna Ali Saeed Mohammad 210.77
Asma Saleh Jaafar 118.52
Ayesha Abdulrazaq 202.56
Aysha Yousif Matar 55.21
Fatima Sulaiman Abdulla 172.93
Fatma Ahmad Ali Khalfan 188.96
Fatma Ahmad Ramadhan 214.03
Fatma Nasser Hussain 190.57
Fatma Salem Ahmad 187.13
Fuad Ahmed Mohamed 45.51
Hajar Humaid Salem 135.54
Hana Hussain Essa 87.01
Hanadi Mubarak 156.5
Hanifa Abdulfatah Hassan 140.31
Hassan Al Jaidi 114.73
Hend Abdulla Mohammad 212.09
Hessa Thani Mubarak 57.34
Iman Hussain Abdulla 71.68
Mean 146.3194
Median 146.68
Mode #N/A
Variance 2938.378
Standard Deviation 54.20681
Maximum 217
Minimum 44.74
Range 172.26
Skewness -0.4475
Kurtosis -0.99975
Count 31
1st Quartile(Q1) 113.42
2nd Quartile (Q2) 146.68
3rd Quartile (Q3) 193.145
IQR 79.725
Pearson’s Coefficient (PC) = 3(Mean - Median)/Standard Deviation
PC = 3(146.3194 - 146.68)/54.20681 = -0.01996

Table 6: Sample Size 45

Sample size 45
Name Iron, 60 – 160 µg/dL
Ayesha Abdulrazaq 202.56
Aysha Yousif Matar 55.21
Fatima Sulaiman Abdulla 172.93
Fatma Ahmad Ali Khalfan 188.96
Fatma Ahmad Ramadhan 214.03
Fatma Nasser Hussain 190.57
Fatma Salem Ahmad 187.13
Fuad Ahmed Mohamed 45.51
Hajar Humaid Salem 135.54
Hana Hussain Essa 87.01
Hanadi Mubarak 156.5
Hanifa Abdulfatah Hassan 140.31
Hassan Al Jaidi 114.73
Hend Abdulla Mohammad 212.09
Hessa Thani Mubarak 57.34
Iman Hussain Abdulla 71.68
Juma Sayed Abdulla 108.97
Laila Ahmad Mohamed 173.15
Latifa Abdulla Husain 42.61
Latifa Mubarak Bilal 172.67
Lubna Mohamed Sharif 123.64
Mahmood Ahmad Abdulrazzaq 207.56
Marwa Yousuf Mahmoud 134.26
Maryam Abdulhamid Moahmed 183.36
Maryam Hamad Obaid 76.75
Mira Ahmad Khalfan 126.64
Moammer Abdulla Saed Salem 210.95
Mohamed Ghanim 217.94
Mohammad Ali Jamil Al Balooshi 191.7
Mohammed Sulaiman Al Mehri 54.68
Mona Ahmed Mohamed Ali 181.81
Muna Ali Ahmad Mohammad 111.07
Nadia Abdulla Mohamed 52.73
Noora Ibrahim Mohaed Ahmad 95.58
Rashed Abdulla 219.36
Reem Omair Ali Omair 56.05
Saeed Mubarak Rashid 76.23
Saeed Salem Saeed 82.64
Sara Abdulla Ibrahim Mohamed 164.17
Sara Abdulla Mohamed Al Abdulla 108.94
Sarah Ali Abdulla Ahmad 43.52
Sulaiman Darwish 53.46
Zainab Ali Ebrahim Ali 197.76
Rahman Ali 78.57
Muna Moammar 108.29
Mean 130.8258
Median 126.64
Mode #N/A
Variance 3430.175
Standard Deviation 58.5677
Maximum 219.36
Minimum 42.61
Range 176.75
Skewness -0.00951
Kurtosis -1.46653
Count 45
1st Quartile(Q1) 76.75
2nd Quartile (Q2) 126.64
3rd Quartile (Q3) 187.13
IQR 110.38
Pearson’s Coefficient (PC) = 3(Mean - Median)/Standard Deviation
PC = 3(130.8258 - 126.64)/58.5677 = 0.214407
  • Do the widths of the above three Confidence Intervals differ?

Comment on this. 2 Marks

Yes, the widths of the three 95% confidence intervals differ. Each width equals 2 × 1.96 × s/√n, so it shrinks as the sample size grows: about 54.86 for n = 15 (s = 54.19972), 38.16 for n = 31 (s = 54.20681), and 34.22 for n = 45 (s = 58.5677). A larger sample gives a smaller standard error and therefore a narrower, more precise interval.

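  • From the value of the Population mean calculated in PART I, check if all of your confidence intervals contain the μ. Elaborate your answer. 3 Marks

All three 95% confidence intervals contain the population mean µ = 136.516 from Part I. Using sample mean ± 1.96 × s/√n:

n = 15: 111.47 < µ < 166.33

n = 31: 127.24 < µ < 165.40

n = 45: 113.71 < µ < 147.94

In each case 136.516 lies between the lower and upper limits.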
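
The comparison across the three sample sizes can be checked with a short script. This is a sketch only: it takes the sample means and standard deviations from Tables 2, 5 and 6, assumes a z-based 95% interval, and the names `samples` and `results` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)   # about 1.96
POPULATION_MEAN = 136.516            # from Table 1

# n -> (sample mean, standard deviation), taken from Tables 2, 5 and 6.
samples = {
    15: (138.9013, 54.19972),
    31: (146.3194, 54.20681),
    45: (130.8258, 58.5677),
}

results = {}
for n, (mean, std_dev) in samples.items():
    margin = Z_95 * std_dev / sqrt(n)
    low, high = mean - margin, mean + margin
    results[n] = {
        "low": low,
        "high": high,
        "width": high - low,
        "contains_mu": low <= POPULATION_MEAN <= high,
    }

for n, r in results.items():
    print(f"n={n}: {r['low']:.2f} to {r['high']:.2f}, "
          f"width {r['width']:.2f}, contains mu: {r['contains_mu']}")
```

Because the standard deviations of the three samples are similar, the width is driven mainly by the √n in the denominator, which is why the interval narrows as n increases.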

PART IV: Hypothesis Testing (16 Marks)

  1. Write Down A Clear ‘Research Problem’(This Should Be A Clear Statement Mentioning The Variable And The Hypothesis Test You Intend To Perform On It.)                                                               3 Marks

In this study, the research problem concerns the iron levels in a person’s body. The population consists of 60 people, with an average iron level of 136.516 and a population standard deviation of 57.08718578. A sample of 15 people was drawn, and its mean iron level is 138.9013333. A two-tailed z test at the 95% confidence level (significance level α = 0.05) will establish whether the mean iron level differs significantly from 60 µg/dL, the lower limit of the 60 – 160 µg/dL range.

  • Perform Hypothesis Testing For The Research Problem. Use Your Own Confidence/Significance Level, Type Of Test And Any Other Data You Need To Conduct The Hypothesis Test. Show All Your Work Clearly, Including Calculations, Sketch Of The Normal Curve, The Chosen Test And The Values Of Level Of Significance, Critical Value, Etc. 10 Marks

Hypothesis testing

H0: µ = 60

H1: µ ≠ 60

This is a two-tailed test, and the z test is used to test the hypothesis:

Sample size = 15

Standard deviation = 54.19972

Sample mean = 138.9013

Hypothesized mean = 60

Confidence level = 95% (α = 0.05)

Critical values = ±1.96

Test statistic z = (138.9013 - 60)/(54.19972/√15) = 78.9013/13.9943 = 5.64

Table 7: 2-tailed Normal Distribution
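
A minimal sketch of the two-tailed one-sample z test for H0: µ = 60, written with the Python standard library. The function name `one_sample_z_test` is a hypothetical helper, and the inputs are the sample values for n = 15 given in Part III.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, hypothesized_mean, std_dev, n, alpha=0.05):
    """Two-tailed z test; returns (z, critical value, p-value, reject H0?)."""
    standard_normal = NormalDist()
    z = (sample_mean - hypothesized_mean) / (std_dev / sqrt(n))
    critical = standard_normal.inv_cdf(1 - alpha / 2)      # about 1.96 for alpha = 0.05
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))        # area in both tails
    return z, critical, p_value, abs(z) > critical


z, critical, p_value, reject_h0 = one_sample_z_test(
    sample_mean=138.9013, hypothesized_mean=60, std_dev=54.19972, n=15
)
print(f"z = {z:.2f}, critical = ±{critical:.2f}, reject H0: {reject_h0}")
```

The decision rule is the one sketched on the normal curve: H0 is rejected when the test statistic falls in either tail beyond the critical values.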

  • Summarize the results (What do the test results indicate – Reject the Null hypothesis OR Fail to reject the Null Hypothesis?). Write a Final Conclusion connecting your results obtained above with the “Research Problem” stated in Part a of this section. 3 Marks

The value of the z test statistic (about 5.64) lies in the rejection region, beyond the critical value of 1.96. Thus, H0 is rejected in favour of H1. At the 5% significance level (95% confidence level), there is sufficient evidence that the mean iron level differs from 60 µg/dL; the sample mean of 138.9013 shows that it is higher.

Project Report Format (5 marks)

  • Cover page, page numbers, table of contents, and appendices if needed. 3 Marks
  • Include a Conclusions Section at the end summarizing the results of the two Inferential Statistics methods used by you in this study. 2 Marks
