Computer and Information Science

Posted: August 26th, 2021

HIGHER COLLEGES OF TECHNOLOGY

Computer and Information Science

Non-Exam Based Assessment

Cover Sheet

Course Name	Statistics and Probability	Course Code	CIS 2003
Assessment	Statistics Project	Handing out Week	Week 9
To be submitted in:	Week 14
Maximum Marks	85 Marks	Percentage of Final Grade.	25%

This assessment will assess the following Course learning outcomes:

	CLO1	CLO2	CLO3	CLO4	CLO5
Question No.	√	√	√	√	√
The entire project/case study/poster is designed and developed by me (and my team members). Proper citation has been used wherever I (and my team members) used other sources. No part of this project has been designed, developed, or written for me (and my team members) by a third party. I have a copy of this project in case the submitted copy is lost or damaged. None of the music/graphics/animation/video/images used in this project violate the copyright, patent, or intellectual property rights of any individual, company, or institution.
Student Signature: Date:
Student Signature: Date:

For Examiner’s Use Only

Question No.	A Project Report	B 50% Weighted marks of Project Report	C Oral/viva	D 50% weighted scores of Oral/viva	Total Marks=B +D
Marks Allocated	70	50%	15	50%
Student 1
Student 2
Student 3

Table of Contents

Non-Exam Based Assessment

Cover Sheet

This assessment will assess the following Course learning outcomes

Part 1

Introduction

c. Summarize and present the Population using Descriptive Statistics (Mean, Median, Mode, Variance and Standard Deviation).

d. Present Population data graphically using box plot and histogram.

e.

f. Take a Sample of 15 from the above Population and use Descriptive Statistics (Mean, Median, Mode, Variance, and Standard Deviation) to summarize the sample.

g. Explain in a sentence or 2 which method of sampling was used by you for selecting the above sample. Explain this method clearly.

Part 2

b. Using the histogram constructed in Section I d, calculate Pearson’s coefficient of skewness and check for outliers.

c. Make a final comment

Part 3

B. Population mean at a Confidence level

C. Repeat (Part II a) for samples of sizes 31 and 45.

D. Widths of the above confidence level.

e. Check if all of your confidence intervals contain the μ.

Part 4

A. Research problem

B. Find the null and alternative hypotheses (H₀ and H₁)

Use a z test to make a conclusion.

Project Objectives:

To summarize data using descriptive statistics.
To check for normality of data
To estimate the population parameters from sample statistics using Confidence Levels for the normal data set.
To conduct hypothesis testing about the Mean.

Project Description:

The practical part of the project is described in sections I, II, III, and IV.
MS-Excel, Megastat Add-in program can be used to complete the tasks.
An appropriately formatted MS-Word report describing all the project tasks should be submitted at the end of the project.
This will be followed by an individual Viva/Oral defense.

Section I: Descriptive Statistics - Population Parameters and Sample Statistics (23 Marks)

A Short Introduction. 2 Marks

Descriptive statistics are brief coefficients that summarize a data set, which may be a sample or the entire population. They fall into two broad categories: measures of central tendency and measures of spread (variability). Measures of central tendency indicate the central location of a data set and include the mean, median, and mode. Measures of spread describe how dispersed the data are and include the range (from the minimum and maximum), the variance, and the standard deviation; skewness and kurtosis additionally describe the shape of the distribution. This paper is divided into four parts. Part one summarizes the population and a sample using descriptive statistics and presents the data graphically using Excel. Part two checks the normality of the population using the histogram, Pearson's coefficient of skewness, and a check for outliers. Part three estimates the population mean using confidence intervals, and part four carries out a hypothesis test about the mean and summarizes the result.

Select A Data Set (No Two Groups Can Choose The Same Dataset) 2 Marks

Dubai Statistics Center https://www.dsc.gov.ae/en-us

The selected data set records the iron level in each person's body (listed reference range: 60 – 160 µg/dL).

Summarize and present the Population using Descriptive Statistics (Mean, Median, Mode, Variance, and Standard Deviation).

Table 1: Descriptive Statistics of the Population (Mean, Median, Mode, Variance and Standard Deviation)

Mean	136.516
Median	134.9
Mode	#N/A
Variance	3258.946781
Standard Deviation	57.08718578
Maximum	219.36
Minimum	42.61
Range	176.75
Skewness	-0.147151978
Kurtosis	-1.335204003
Count	60

Present Population data graphically using box plot and histogram. 3 Marks

Figure 1: Histogram on Iron Levels

Figure 2: Histogram on Iron levels

Table 2: Descriptive Statistics for Iron, 60 – 160 µg/dL

Name	Iron, 60 – 160 µg/dL
Fatma Ahmad Ali Khalfan	188.96
Fatma Ahmad Ramadhan	214.03
Fatma Nasser Hussain	190.57
Fatma Salem Ahmad	187.13
Fuad Ahmed Mohamed	45.51
Hajar Humaid Salem	135.54
Hana Hussain Essa	87.01
Hanadi Mubarak	156.5
Hanifa Abdulfatah Hassan	140.31
Hassan Al Jaidi	114.73
Hend Abdulla Mohammad	212.09
Hessa Thani Mubarak	57.34
Iman Hussain Abdulla	71.68
Juma Sayed Abdulla	108.97
Laila Ahmad Mohamed	173.15
Descriptive Statistics
Mean	138.9013
Median	140.31
Mode	#N/A
Variance	2937.609
Standard Deviation	54.19972
Maximum	214.03
Minimum	45.51
Range	168.52
Skewness	-0.28206
Kurtosis	-1.21811
Count	15
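
The sample statistics in Table 2 can be checked outside Excel. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original project. The tabulated variance and standard deviation appear to match the divide-by-n formulas, so the n − 1 versions are computed alongside for comparison.

```python
import statistics

# Iron levels (µg/dL) of the 15 sampled individuals, copied from Table 2.
sample = [188.96, 214.03, 190.57, 187.13, 45.51, 135.54, 87.01, 156.5,
          140.31, 114.73, 212.09, 57.34, 71.68, 108.97, 173.15]

mean = statistics.mean(sample)
median = statistics.median(sample)

# multimode returns every value tied for most frequent. When all values are
# distinct it returns the whole list, which corresponds to Excel's #N/A.
modes = statistics.multimode(sample)
has_mode = len(modes) < len(sample)

var_n = statistics.pvariance(sample)   # divides by n
sd_n = statistics.pstdev(sample)
var_n1 = statistics.variance(sample)   # divides by n - 1
sd_n1 = statistics.stdev(sample)

data_range = max(sample) - min(sample)

print(f"mean={mean:.4f} median={median} mode={'yes' if has_mode else 'none'}")
print(f"variance (n)={var_n:.3f}  sd (n)={sd_n:.5f}")
print(f"variance (n-1)={var_n1:.3f}  sd (n-1)={sd_n1:.5f}")
print(f"range={data_range:.2f}")
```

Which divisor is appropriate depends on whether the 15 values are treated as a complete population or as a sample used to estimate population parameters.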

Compare And Comment on the Shape of the Histogram Obtained For The Above Data With General Shapes of Histograms. (Note: You May Refer Online To Find Out And Compare The Various Shapes of A Histogram.) 3 Marks

The histogram in Figure 2 has a bimodal shape. A bimodal histogram has two peaks separated by a dip, in contrast to the bell shape, which has a single central peak with bars decreasing gradually on the left and right sides. A bimodal shape can indicate that the data come from two different sources. In that case the two sources should be separated and analyzed individually before statistical conclusions are drawn about the population.

Take A Sample of 15 From The Above Population and Use Descriptive Statistics (Mean, Median, Mode, Variance, And Standard Deviation) To Summarize The Sample. 4 Marks
State Which Method of Sampling Was Used By You for Selecting the Above Sample

Explain clearly how you carried out sampling using this method. 5 Marks

The sampling technique applied to the population is simple random sampling. Under this technique, every observation in the data set has an equal probability of being selected. The sample is meant to be a fair representation of the entire population; any difference between a sample statistic and the corresponding population parameter is called sampling error. The method was chosen because it is easy, straightforward, and unbiased, and it gives every member of the population an equal chance of being included. Each iron-level record therefore had the same chance of being selected from the overall population.
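
Simple random sampling without replacement can be sketched in a few lines of Python. This is only an illustration: the ID numbers 1 to 60 are hypothetical stand-ins for the 60 individuals, and the seed is an arbitrary choice made so the draw can be repeated.

```python
import random

# Hypothetical sampling frame: one ID per individual in the population of 60.
population_ids = list(range(1, 61))

# A seeded generator makes the draw reproducible; the seed value is arbitrary.
rng = random.Random(2021)

# random.sample draws without replacement, so no ID can appear twice and
# every ID has the same chance of being included.
sample_ids = rng.sample(population_ids, k=15)

print(sorted(sample_ids))
```

In a spreadsheet, a comparable approach is to assign a random number to each row and keep the 15 rows with the smallest values.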

PART II: Checking the Normality of Population (6 Marks)

a. Read Pages 322-324 Of The Textbook (Provided In Bbl). 0 Marks

b. Using the Histogram Constructed in Section I d, Calculate Pearson’s Coefficient of Skewness and Check for Outliers. 5 Marks

The histogram in Section 1d is for the entire Population.

Table 3: Descriptive statistics and skewness of the Population as a whole

Mean	136.516
Median (Q2)	134.9
Mode	#N/A
Variance	3258.946781
Standard Deviation	57.08718578
Maximum	219.36
Minimum	42.61
Range	176.75
Skewness	-0.147151978
Kurtosis	-1.335204003
Count	60
1st Quartile(Q1)	85.9175
2nd Quartile (Q2)	134.9
3rd Quartile (Q3)	189.3625
IQR	103.445

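Pearson's coefficient of skewness = 3(Mean − Median) / Standard Deviation = 3(136.516 − 134.9) / 57.08718578 = 0.084922736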

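Outliers are values that lie below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR. With IQR = 103.445, these limits are 85.9175 − 155.1675 = −69.25 and 189.3625 + 155.1675 = 344.53. The minimum (42.61) and the maximum (219.36) both lie within the limits, so the population has no outliers. The quartiles used are as follows: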

Table 4: 1st and 3rd quartile

1st Quartile(Q1)	85.9175
3rd Quartile (Q3)	189.3625
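
The two normality checks in this part can be reproduced from the summary values in Table 3. The sketch below assumes the usual 1.5 × IQR rule for outlier limits; the variable names are illustrative.

```python
# Population summary values copied from Table 3.
mean, median, sd = 136.516, 134.9, 57.08718578
q1, q3 = 85.9175, 189.3625
minimum, maximum = 42.61, 219.36

# Pearson's coefficient of skewness: 3(mean - median) / standard deviation.
pearson_index = 3 * (mean - median) / sd

# Outlier limits (fences) based on the interquartile range.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# If the extremes are inside the fences, no value can be an outlier.
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"Pearson index = {pearson_index:.6f}")
print(f"fences = ({lower_fence:.2f}, {upper_fence:.2f}), outliers: {has_outliers}")
```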

d. Make a final comment (using the values/findings from the above step) about the normality of the Population. 1 Mark

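The Pearson coefficient of skewness is 0.084922736, a small positive value that lies well within the interval from −1 to +1, so the population is not significantly skewed. The outlier check also found no values outside the limits. On these two checks the population can be treated as approximately normally distributed, although the shape of the histogram should also be taken into account.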

PART III: Estimation of Population Mean (20 Marks)

For The Random Sample Of Size 15 Taken Above In PART 1, Make Both The Point And Interval Estimates For The Population Mean At A Confidence Level Of 95%. 5 Marks

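Sample size (n) = 15

Standard deviation (s) = 54.19972

Sample mean (x̄) = 138.9013

Confidence level = 95%

Critical value z = 1.96

Point estimate of µ = x̄ = 138.9013

Interval estimate: x̄ ± z × s/√n = 138.9013 ± 1.96 × 54.19972/√15 = 138.9013 ± 27.4288

111.4725 < µ < 166.3302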
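
As a cross-check, the 95% z interval for the sample of 15 can be computed with Python's standard library. This is a sketch under the same assumptions as the calculation above (a z interval built from the tabulated standard deviation). `NormalDist().inv_cdf` returns the critical value to full precision (about 1.95996) rather than the rounded 1.96, so the last decimals of the limits can differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

# Summary values for the sample of 15, copied from Table 2.
n, x_bar, s = 15, 138.9013, 54.19972
confidence = 0.95

# Two-sided critical value: the (1 - alpha/2) quantile of the standard normal.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * s / sqrt(n)          # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"z = {z:.4f}, margin = {margin:.4f}")
print(f"{lower:.4f} < mu < {upper:.4f}")
```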

Repeat (Part III a) for samples of sizes 31 and 45. 10 Marks

Table 5: Sample size 31

Sample size 31
Name	Iron, 60 – 160 µg/dL
Aaeda Abdulaziz E	177.09
Abdulla Ali Mohamed	123.38
Abdulla Juma Jaffar Hussain	182.93
Abdullah Khamis Mohamed	107.34
Abdulrasool Saleh	195.72
Abeer Rashed Saeed	205.69
Afra Belal Ismail	204.81
Ahlam Abdulla Mohamed	217
Ahmad Abdulla Ahmad	142.29
Ali Ahmed Sulaiman	114.38
Ali Mohamed Ali	112.46
Ali Mohammad Bin Tamim	146.68
Amal Hassan Abdulla	44.74
Amna Ali Saeed Mohammad	210.77
Asma Saleh Jaafar	118.52
Ayesha Abdulrazaq	202.56
Aysha Yousif Matar	55.21
Fatima Sulaiman Abdulla	172.93
Fatma Ahmad Ali Khalfan	188.96
Fatma Ahmad Ramadhan	214.03
Fatma Nasser Hussain	190.57
Fatma Salem Ahmad	187.13
Fuad Ahmed Mohamed	45.51
Hajar Humaid Salem	135.54
Hana Hussain Essa	87.01
Hanadi Mubarak	156.5
Hanifa Abdulfatah Hassan	140.31
Hassan Al Jaidi	114.73
Hend Abdulla Mohammad	212.09
Hessa Thani Mubarak	57.34
Iman Hussain Abdulla	71.68
Mean	146.3194
Median	146.68
Mode	#N/A
Variance	2938.378
Standard Deviation	54.20681
Maximum	217
Minimum	44.74
Range	172.26
Skewness	-0.4475
Kurtosis	-0.99975
Count	31
1st Quartile(Q1)	113.42
2nd Quartile (Q2)	146.68
3rd Quartile (Q3)	193.145
IQR	79.725
Pearson's Coefficient (PC) = 3(Mean − Median) / Standard Deviation

PC=	-0.01996

Table 6: Sample size 45

Sample size 45
Name	Iron, 60 – 160 µg/dL
Ayesha Abdulrazaq	202.56
Aysha Yousif Matar	55.21
Fatima Sulaiman Abdulla	172.93
Fatma Ahmad Ali Khalfan	188.96
Fatma Ahmad Ramadhan	214.03
Fatma Nasser Hussain	190.57
Fatma Salem Ahmad	187.13
Fuad Ahmed Mohamed	45.51
Hajar Humaid Salem	135.54
Hana Hussain Essa	87.01
Hanadi Mubarak	156.5
Hanifa Abdulfatah Hassan	140.31
Hassan Al Jaidi	114.73
Hend Abdulla Mohammad	212.09
Hessa Thani Mubarak	57.34
Iman Hussain Abdulla	71.68
Juma Sayed Abdulla	108.97
Laila Ahmad Mohamed	173.15
Latifa Abdulla Husain	42.61
Latifa Mubarak Bilal	172.67
Lubna Mohamed Sharif	123.64
Mahmood Ahmad Abdulrazzaq	207.56
Marwa Yousuf Mahmoud	134.26
Maryam Abdulhamid Moahmed	183.36
Maryam Hamad Obaid	76.75
Mira Ahmad Khalfan	126.64
Moammer Abdulla Saed Salem	210.95
Mohamed Ghanim	217.94
Mohammad Ali Jamil Al Balooshi	191.7
Mohammed Sulaiman Al Mehri	54.68
Mona Ahmed Mohamed Ali	181.81
Muna Ali Ahmad Mohammad	111.07
Nadia Abdulla Mohamed	52.73
Noora Ibrahim Mohaed Ahmad	95.58
Rashed Abdulla	219.36
Reem Omair Ali Omair	56.05
Saeed Mubarak Rashid	76.23
Saeed Salem Saeed	82.64
Sara Abdulla Ibrahim Mohamed	164.17
Sara Abdulla Mohamed Al Abdulla	108.94
Sarah Ali Abdulla Ahmad	43.52
Sulaiman Darwish	53.46
Zainab Ali Ebrahim Ali	197.76
Rahman Ali	78.57
Muna Moammar	108.29
Mean	130.8258
Median	126.64
Mode	#N/A
Variance	3430.175
Standard Deviation	58.5677
Maximum	219.36
Minimum	42.61
Range	176.75
Skewness	-0.00951
Kurtosis	-1.46653
Count	45
1st Quartile(Q1)	76.75
2nd Quartile (Q2)	126.64
3rd Quartile (Q3)	187.13
IQR	110.38
Pearson's Coefficient (PC) = 3(Mean − Median) / Standard Deviation
PC=	0.214407
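
The interval estimate can be repeated for all three sample sizes from the tabulated means and standard deviations. The sketch below is illustrative only: it assumes a 95% z interval for each sample and also reports each interval's width and whether it contains the population mean from Table 1.

```python
from math import sqrt
from statistics import NormalDist

MU = 136.516  # population mean from Table 1

# (n, sample mean, standard deviation) copied from the sample tables.
samples = [
    (15, 138.9013, 54.19972),
    (31, 146.3194, 54.20681),
    (45, 130.8258, 58.5677),
]

z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value

results = []
for n, x_bar, s in samples:
    margin = z * s / sqrt(n)
    lower, upper = x_bar - margin, x_bar + margin
    results.append((n, lower, upper, upper - lower, lower <= MU <= upper))

for n, lower, upper, width, contains_mu in results:
    print(f"n={n}: {lower:.2f} < mu < {upper:.2f}, "
          f"width={width:.2f}, contains mu: {contains_mu}")
```

Because the margin of error is proportional to 1/√n, a larger sample tends to give a narrower interval when the standard deviations are similar.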

Do the widths of the above three Confidence Intervals differ?

Comment on this. 2 Marks

Yes, the widths of the three 95% confidence intervals differ. The width of each interval is 2 × z × s/√n, so it shrinks as the sample size grows: about 54.86 for n = 15 (166.3302 − 111.4725), about 38.16 for n = 31, and about 34.22 for n = 45. A larger sample therefore gives a narrower interval and a more precise estimate of the population mean.

From the value of the Population mean calculated in PART I, check if all of your confidence intervals contain the μ. Elaborate your answer. 3 Marks

The population mean calculated in Part I is µ = 136.516. All three 95% confidence intervals contain it:

n = 15: 111.47 < µ < 166.33

n = 31: 127.24 < µ < 165.40

n = 45: 113.71 < µ < 147.94

PART IV: Hypothesis Testing (16 Marks)

Write Down A Clear ‘Research Problem’ (This Should Be A Clear Statement Mentioning The Variable And The Hypothesis Test You Intend To Perform On It.) 3 Marks

The research problem concerns the iron levels in a person’s body. The population consists of 60 people, with a mean iron level of 136.516 µg/dL and a population standard deviation of 57.08718578. A sample of 15 people was drawn from it, with a sample mean iron level of 138.9013333. The research will test, at the 95% confidence level (5% significance level), whether the mean iron level differs from 60 µg/dL, using a z test.

Perform Hypothesis Testing For The Research Problem. Use Your Own Confidence/Significance Level, Type Of Test And Any Other Data You Need To Conduct The Hypothesis Test. Show All Your Work Clearly, Including Calculations, Sketch Of The Normal Curve, The Chosen Test And The Values Of Level Of Significance, Critical Value, Etc. 10 Marks

Hypothesis testing

H₀: µ = 60

H₁: µ ≠ 60

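This is a two-tailed test, and a z test is used to test the hypothesis.

Sample size (n) = 15

Standard deviation = 54.19972

Sample mean (x̄) = 138.9013

Confidence level = 95% (α = 0.05)

Critical values: z = ±1.96

Test statistic: z = (x̄ − µ₀) / (s/√n) = (138.9013 − 60) / (54.19972/√15) = 78.9013 / 13.9943 ≈ 5.64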

Table 7: 2-tailed Normal Distribution

Summarize the results (What do the test results indicate – Reject the Null hypothesis OR Fail to reject the Null Hypothesis?). Write a Final Conclusion connecting your results obtained above with the “Research Problem” stated in Part a of this section. 3 Marks

The value of the test statistic lies in the rejection region, beyond the critical values of ±1.96. Thus, H₀ is rejected in favor of H₁. At the 5% significance level (95% confidence level), there is sufficient evidence that the mean iron level of the population differs from 60 µg/dL.
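
The two-tailed z test can be sketched in Python as follows. This is an illustration under the assumptions used in this section: the hypothesized mean of 60 from H₀, the summary values of the sample of 15, and a 5% significance level. The variable names are not part of the original project.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 60.0                            # hypothesized mean under H0
n, x_bar, s = 15, 138.9013, 54.19972   # sample of 15, from Table 2
alpha = 0.05                           # significance level

# Test statistic: distance of the sample mean from mu_0 in standard errors.
z_stat = (x_bar - mu_0) / (s / sqrt(n))

# Two-tailed critical value and p-value from the standard normal distribution.
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))

reject_h0 = abs(z_stat) > z_crit

print(f"z = {z_stat:.3f}, critical = ±{z_crit:.3f}, p = {p_value:.2e}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Comparing |z| with the critical value and comparing the p-value with α are equivalent decision rules and lead to the same conclusion.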

Project Report Format (5 marks)

Cover page, page numbers, table of contents, and appendices if needed. 3 Marks
Include a Conclusions Section at the end summarizing the results of the two Inferential Statistics methods used by you in this study. 2 Marks
