AIOU Solved Assignments 2 Code 8614 Autumn & Spring 2023
AIOU Solved Assignments Code 8614 Autumn & Spring 2023
AIOU Solved Assignments 1 & 2 Code 8614 Autumn & Spring 2023. Solved Assignments code 8614 Educational Statistics 2023. Allama iqbal open university old papers.
Assignment No. 2
Autumn & Spring 2023
Educational Statistics (8614)
Q.1Whatisdatacleaning?Writedownitsimportanceandbenefits.Howtoensureitbefore analysisofdata?
DataCleaning
‘Cleaning’referstotheprocessofremovinginvaliddatapointsfromadataset.
Manystatisticalanalysestrytofindapatterninadataseries,basedonahypothesisor assumptionaboutthenatureofthedata.’Cleaning’istheprocessofremovingthose datapointswhichareeither(a)Obviouslydisconnectedwiththeeffectorassumption whichwearetryingtoisolate,duetosomeotherfactorwhichappliesonlytothose particulardatapoints.(b)Obviouslyerroneous,i.e.someexternalerrorisreflectedin thatparticulardatapoint,eitherduetoamistakeduringdatacollection,reportingetc.
Intheprocessweignoretheseparticulardatapoints,andconductouranalysisonthe remainingdata.
‘Cleaning’frequentlyinvolveshumanjudgementtodecidewhichpointsarevalidand whicharenot,andthereisachanceofvaliddatapointscausedbysomeeffectnot sufficientlyaccountedforinthehypothesis/assumptionbehindtheanalyticalmethod applied.
Thepointstobecleanedaregenerallyextremeoutliers.’Outliers’arethosepointswhich standoutfornotfollowingapatternwhichisgenerallyvisibleinthedata.Onewayof detectingoutliersistoplotthedatapoints(ifpossible)andvisuallyinspectthe resultantplotforpointswhichliefaroutsidethegeneraldistribution.Anotherwayisto runtheanalysisontheentiredataset,andtheneliminatingthosepointswhichdonot meetmathematical’controllimits’forvariabilityfromatrend,andthenrepeatingthe analysisontheremainingdata.
Cleaningmayalsobedonejudgementally,forexampleinasalesforecastbyignoring
historicaldatafromanarea/unitwhichhasatendencytomisreportsalesfigures.To takeanotherexample,inadoubleblindmedicaltestadoctormaydisregardtheresults ofavolunteerwhomthedoctorhappenstoknowinanon-professionalcontext.
‘Cleaning’mayalsosometimesbeusedtorefertovariousother judgemental/mathematicalmethodsofvalidatingdataandremovingsuspectdata.
Theimportanceofhavingcleanandreliabledatainanystatisticalanalysiscannotbe stressedenough.Often,inreal-worldapplicationstheanalystmaygetmesmerisedby thecomplexityorbeautyofthemethodbeingapplied,whilethedataitselfmaybe unreliableandleadtoresultswhichsuggestcoursesofactionwithoutasoundbasis.A goodstatistician/researcher(personalopinion)spends90%ofhis/hertimeon collectingandcleaningdata,anddevelopinghypothesiswhichcoverasmanyexternal explainablefactorsaspossible,andonly10%ontheactualmathematicalmanipulation ofthedataandderivingresults.
BenefitsandImportanceofDataCleaning
Datacleansingistheprocessofrecognizingmistakenorunethicaldatafroma database.Theprocessismainlyusedindatabaseswhereimproper,unfinished, inaccurateorirrelevantpartofthedataisidentifiedandthenaltered,replacedor deleted.Businessenterpriseslargelydependondatawhetheritisthehonestyof customers’addressesorensuringaccurateinvoicesareemailedorpostedtothe recipients.Toensurethatthecustomerdataisusedinthemostproductiveand meaningfulmannerthatcanincreasethefundamentalvalueofthebrand,business enterprisesmustgiveimportancetodataquality.
Managingandensuringthatthedataiscleancanprovidesubstantialgrowthtothe business.Businessenterprisescanfacelotsofhasslessuchashighcostinvolvedin processingerrors,manualtroubleshooting,incorrectinvoicedataandshipmentsto wrongaddress.Theinformationofthecustomerisforeverchangingduetorelocation orotherfactorswhichhavetobechangedandtheupdatedinformationmustreflectin thedatabase.Businessenterprisescanachieveawiderangeofbenefitsbycleansing datawhichcanleadtoloweringoperationalcostsandmaximizingprofits.
Herearethebenefitsofdatacleaning:
1.ImprovestheEfficiencyofCustomerAcquisitionActivities:
Businessenterprisescansignificantlyboosttheircustomeracquisitioneffortsby cleaningtheirdataasamoreefficientprospectslisthavingaccuratedatacanbe
created.Throughoutthemarketingprocess,businessenterprisesmustensurethatthe dataisclean,up-to-dateandaccuratebyregularlyfollowingdataqualityroutines.Multi- channelcustomerdatacanalsobemanagedseamlesslywhichprovidestheenterprise withanopportunitytocarryoutsuccessfulmarketingcampaignsinthefutureasthey wouldbeawareofthemethodstoeffectivelyreachouttotheirtargetaudience.
2.ImprovesDecisionMakingProcess:
Thekeystoneofeffectivedecisionmakinginabusinessenterpriseiscustomerdata. Preciseinformationanddataqualityareessentialtodecisionmaking.Datacleansing cansupportbetteranalyticsaswellasall-roundbusinessintelligencewhichcan facilitatebetterdecisionmakingandexecution.Intheend,havingaccuratedatacan helpbusinessenterprisesmakebetterdecisionswhichwillcontributetothesuccessof thebusinessinthelongrun.
3.StreamlinesBusinessPractices:
Datacleaningalongwiththerightanalyticscanalsohelptheenterprisetoidentifyan opportunitytolaunchanewproductorserviceinthemarketwhichtheconsumers mightlike,oritcanhighlightvariousmarketingavenuesthattheenterprisescantry.For example,ifamarketingcampaignisunsuccessful,thebusinessenterprisecanlookat variousothermarketingchannelsthathavethebestcustomerresponsedataand implementthem.
4.IncreasesProductivity:
Havingacleanandproperlymaintaineddatabasecanhelpbusinessenterprisesto ensurethattheemployeesaremakingthebestuseoftheirworkhours.Itcanalso preventthestaffoffromcontactingcustomerswithoutdatedinformationorcreate invalidvendorfilesinthesystembyhelpingthemtoworkwithcleanrecordsthereby maximizingthestaff’sefficiencyandproductivity.
AIOU Solved Assignments 2 Code 8614 Autumn & Spring 2023
——————-
Q.2Whatismeasureofdifference?Explaindifferenttypesoftestsindetailwithexamples.How arethesetestsusedinhypothesistesting?
InStatistics,deviationisameasureofthedifferencebetweentheobservedvalueofavariable andsomeothervalue,oftenbutnotnecessarilythatvariable’smean.
Forquantitativevariablesyoucanmeasurethisdeviationinseveralwaysincluding:
•Rangewhichisthedifferencebetweenthelargestandsmallestvaluesofthedataset.
•Standarddeviationdefinedasthesquarerootoftheaverageofthesquareddeviationsofthe valuesfromtheiraverage.
•CoefficientofvariationdefinedastheratioofthestandarddeviationtothemeanInpredictive modelssuchasalinearregression,deviationisdefinedasthedifferencebetweenthe measuredvalueandthepredictedone.
TypesofTests
T-Test
At-testisausefulstatisticaltechniqueusedforcomparingmeanvaluesoftwodatasets obtainedfromtwogroups.Thecomparisontellsuswhetherthesedatasetsaredifferentfrom eachother.Itfurthertellsushowsignificantthedifferencesareandifthesedifferencescould havehappenedbychance.Thestatisticalsignificanceoft-testindicateswhetherornotthe differencebetweenthemeanoftwogroupsmostlikelyreflectsarealdifferenceinthe populationfromwhichthegroupsareselected.
t-testsareusedwhentherearetwogroups(maleandfemale)ortwosetsofdata(beforeand after),andtheresearcherwishestocomparethemeanscoreonsomecontinuousvariable.
AnalysisofVariance(ANOVA)
Thet-testshaveoneveryseriouslimitation–theyarerestrictedtotestsofthesignificanceof thedifferencebetweenonlytwogroups.Therearemanytimeswhenweliketoseeifthereare significantdifferencesamongthree,four,orevenmoregroups.Forexamplewemaywantto investigatewhichofthreeteachingmethodsisbestforteachingninthclassalgebra.Insuch case,wecannotuset-testbecausemorethantwogroupsareinvolved.Todealwithsuchtype ofcasesoneofthemostusefultechniquesinstatisticsisanalysisofvariance(abbreviatedas ANOVA).ThistechniquewasdevelopedbyaBritishStatisticianRonaldA.Fisher(Dietz&Kalof, 2009;Bartz,1981)
AnalysisofVariance(ANOVA)isahypothesistestingprocedurethatisusedtoevaluatemean differencesbetweentwoormoretreatments(orpopulation).Likeallotherinferential procedures.ANOVAusessampledatatoasabasisfordrawinggeneralconclusionabout populations.Sometime,itmayappearthatANOVAandt-testaretwodifferentwaysofdoing exactlysamething:testingformeandifferences.Insomecasedthisistrue–bothtestsuse sampledatatotesthypothesisaboutpopulationmean.However,ANOVAhasmuchmore advantagesovert-test.t-testsareusedwhenwehavecompareonlytwogroupsorvariables (oneindependentandonedependent).OntheotherhandANOVAisusedwhenwehavetwoor morethantwoindependentvariables(treatment).Supposewewanttostudytheeffectsof threedifferentmodelsofteachingontheachievementofstudents.Inthiscasewehavethree differentsamplestobetreatedusingthreedifferenttreatments.SoANOVAisthesuitable techniquetoevaluatethedifference.
ChiSquareTest
TheChiSquareTestisatestthatinvolvestheuseofparameterstotestthestatistical significanceoftheobservationsunderstudy.
Thetaskofthechisquaretestistotestthestatisticalsignificanceoftheobservedrelationship withrespecttotheexpectedrelationship.Thechisquarestatisticisusedbytheresearcherfor determiningwhetherornotarelationshipexists.
Inthechisquaretest,thenullhypothesisisassumedastherenotbeinganassociationbetween thetwovariablesthatareobservedinthestudy.Thechisquaretestiscalculatedbyevaluating thecellfrequenciesthatinvolvetheexpectedfrequenciesinthosetypesofcaseswhenthereis noassociationbetweenthevariables.Thecomparisonbetweentheexpectedtypeoffrequency andtheactualobservedfrequencyisthenmadeinthistest.Thecomputationoftheexpected frequencysquaretestiscalculatedastheproductofthetotalnumberofobservationsinthe rowandthecolumn,whichisdividedbythetotalsizeofthesample.
Thecalculationofthestatisticinthechisquaretestisdonebycomputingthesumofthe squareofthedeviationbetweentheobservedandtheexpectedfrequency,whichisdividedby theexpectedfrequency.
Theresearchershouldknowthatthegreaterthedifferencebetweentheobservedandexpected cellfrequency,thelargerthevalueofthechisquarestatisticinthechisquaretest.
Inordertodetermineiftheassociationbetweenthetwovariablesexists,theprobabilityof obtainingavalueofchisquareshouldbelargerthantheoneobtainedfromthechisquaretest ofcrosstabulation.
Thereisonemorepopulartestcalledthechisquaretestforgoodnessoffit.
Thistypeoftestcalledthechisquaretestforgoodnessoffithelpstheresearcherto understandwhetherornotthesampledrawnfromacertainpopulationhasaspecific distributionandwhetherornotitactuallybelongstothatspecifieddistribution.Thistypeoftest canbeapplicabletoonlydiscretetypesofdistribution,likePoisson,binomial,etc.Thistypeof chisquaretestisanalternativetestforthenonparametrictestcalledtheKolmogorovSmrinov goodnessoffittest.
Thenullhypothesisassumedbytheresearcherinthistypeofchisquaretestisthatthedata drawnfromthepopulationfollowsthespecifieddistribution.Thechisquarestatisticinthistest isdefinedinasimilarmannertothedefinitionintheabovetypeoftest.Oneoftheimportant pointstobenotedbytheresearcheristhattheexpectednumberoffrequenciesinthistypeof chisquaretestshouldbeatleastfive.Thismeansthatthechisquaretestwillnotbevalidfor thosewhoseexpectedcellfrequencyislessthanfive.
Therearecertainassumptionsinthechisquaretest.
Therandomsamplingofdataisassumedinthechisquaretest.
Inthechisquaretest,asamplewithasufficientlylargesizeisassumed.Ifthechisquaretestis conductedonasamplewithasmallersize,thenthechisquaretestwillyieldinaccurate inferences.Theresearcher,byusingthechisquaretestonsmallsamples,mightendup committingaTypeIIerror.
Inthechisquaretest,theobservationsarealwaysassumedtobeindependentofeachother.
Inthechisquaretest,theobservationsmusthavethesamefundamentaldistribution.
AIOU Solved Assignments Code 8614 Autumn & Spring 2023
——————–
Q.3Explaintheconceptofreliability.Explaintypesofreliabilityandmethodsusedtocalculate eachtype.
Reliabilityisthedegreetowhichanassessmenttoolproducesstableandconsistent results.Reliabilityinstatisticsandpsychometricsistheoverallconsistencyofameasure.A measureissaidtohaveahighreliabilityifitproducessimilarresultsunderconsistent conditions.”Itisthecharacteristicofasetoftestscoresthatrelatestotheamountofrandom errorfromthemeasurementprocessthatmightbeembeddedinthescores.Scoresthatare highlyreliableareaccurate,reproducible,andconsistentfromonetestingoccasiontoanother. Thatis,ifthetestingprocesswererepeatedwithagroupoftesttakers,essentiallythesame resultswouldbeobtained.Variouskindsofreliabilitycoefficients,withvaluesrangingbetween 0.00(mucherror)and1.00(noerror),areusuallyusedtoindicatetheamountoferrorinthe scores.”Forexample,measurementsofpeople’sheightandweightareoftenextremelyreliable.
TypesofReliability
Test-retestreliabilityisameasureofreliabilityobtainedbyadministeringthesametesttwice overaperiodoftimetoagroupofindividuals.ThescoresfromTime1andTime2canthenbe correlatedinordertoevaluatethetestforstabilityovertime.
Example:Atestdesignedtoassessstudentlearninginpsychologycouldbegiventoagroupof studentstwice,withthesecondadministrationperhapscomingaweekafterthefirst.The obtainedcorrelationcoefficientwouldindicatethestabilityofthescores.
Parallelformsreliabilityisameasureofreliabilityobtainedbyadministeringdifferentversions ofanassessmenttool(bothversionsmustcontainitemsthatprobethesameconstruct,skill, knowledgebase,etc.)tothesamegroupofindividuals.Thescoresfromthetwoversionscan thenbecorrelatedinordertoevaluatetheconsistencyofresultsacrossalternateversions.
Example:Ifyouwantedtoevaluatethereliabilityofacriticalthinkingassessment,youmight createalargesetofitemsthatallpertaintocriticalthinkingandthenrandomlysplitthe questionsupintotwosets,whichwouldrepresenttheparallelforms.
Inter-raterreliabilityisameasureofreliabilityusedtoassessthedegreetowhichdifferent judgesorratersagreeintheirassessmentdecisions.Inter-raterreliabilityisusefulbecause humanobserverswillnotnecessarilyinterpretanswersthesameway;ratersmaydisagreeasto howwellcertainresponsesormaterialdemonstrateknowledgeoftheconstructorskillbeing assessed.
Example:Inter-raterreliabilitymightbeemployedwhendifferentjudgesareevaluatingthe degreetowhichartportfoliosmeetcertainstandards.Inter-raterreliabilityisespeciallyuseful whenjudgmentscanbeconsideredrelativelysubjective.Thus,theuseofthistypeofreliability wouldprobablybemorelikelywhenevaluatingartworkasopposedtomathproblems.
Internalconsistencyreliabilityisameasureofreliabilityusedtoevaluatethedegreetowhich differenttestitemsthatprobethesameconstructproducesimilarresults.
Averageinter-itemcorrelationisasubtypeofinternalconsistencyreliability.Itisobtainedby takingalloftheitemsonatestthatprobethesameconstruct(e.g.,readingcomprehension), determiningthecorrelationcoefficientforeachpairofitems,andfinallytakingtheaverageof allofthesecorrelationcoefficients.Thisfinalstepyieldstheaverageinter-itemcorrelation.
Split-halfreliabilityisanothersubtypeofinternalconsistencyreliability.Theprocessof obtainingsplit-halfreliabilityisbegunby“splittinginhalf”allitemsofatestthatareintendedto probethesameareaofknowledge(e.g.,WorldWarII)inordertoformtwo“sets”ofitems.The entiretestisadministeredtoagroupofindividuals,thetotalscoreforeach“set”iscomputed, andfinallythesplit-halfreliabilityisobtainedbydeterminingthecorrelationbetweenthetwo total“set”scores.
AIOU Solved Assignments 2 Autumn 2023 Code 8614
——————–
Q.4Whatiscorrelation?Howlevelofmeasurementhelpusinselectingcorrecttypeof correlation?Writecomprehensivenoteonrangeofcorrelationcoefficientandwhatdoesit explain?Canwepredictfuturecorrelationbycurrentrelationship?Ifyes,thenhow?
CORRELATION
Correlationisastatisticaltechniqueusedtomeasureanddescriberelationshipbetweentwo variables.Thesevariablesareneithermanipulatednorcontrolled,rathertheysimplyare observedastheynaturallyexistintheenvironment.Supposearesearcherisinterestedin
relationshipbetweennumberofchildreninafamilyandIQoftheindividualchild.Hewouldtake agroupofstudentscomingfromdifferentfamilies.Thenhesimplyobserveorrecordthe numberofchildreninafamilyandthenmeasureIQscoreofeachindividualstudentsamegroup. Hewillneithermanipulatenorcontrolanyvariable.Correlationrequirestwoseparatescoresfor eachindividual(onescorefromeachoftwovariables).Thesescoresarenormallyidentifiedas XandYandcanbepresentedinatableorinagraph.
Correlationisusedtotestrelationshipsbetweenquantitativevariablesorcategoricalvariables. Inotherwords,it’sameasureofhowthingsarerelated.Thestudyofhowvariablesare correlatediscalledcorrelationanalysis.
Someexamplesofdatathathaveahighcorrelation:
*Yourcaloricintakeandyourweight.
*Youreyecolorandyourrelatives’eyecolors.
*TheamountoftimeyourstudyandyourGPA.
Someexamplesofdatathathavealowcorrelation(ornoneatall):
*Yoursexualpreferenceandthetypeofcerealyoueat.
*Adog’snameandthetypeofdogbiscuittheyprefer.
*Thecostofacarwashandhowlongittakestobuyasodainsidethestation.
Correlationsareusefulbecauseifyoucanfindoutwhatrelationshipvariableshave,youcan makepredictionsaboutfuturebehavior.Knowingwhatthefutureholdsisveryimportantinthe socialscienceslikegovernmentandhealthcare.Businessesalsousethesestatisticsfor budgetsandbusinessplans.
CORRELATIONCOEFFICIENT
The”correlationcoefficient”wascoinedbyKarlPearsonin1896.Accordingly,thisstatisticis overacenturyold,andisstillgoingstrong.Itisoneofthemostusedstatisticstoday;secondto themean.Thecorrelationcoefficient’sweaknessesandwarningsofmisusearewell documented.Asafifteen-yearpracticedconsultingstatistician,whoalsoteachesstatisticians continuingandprofessionalstudiesfortheDatabaseMarketing/DataMiningIndustry,Iseetoo oftentheweaknessesandwarningsarenotheeded.Amongtheweaknesses/uses,thereisone thatisrarelymentioned:thecorrelationcoefficientinterval[-1,+1]isrestrictedbytheindividual distributionsofthetwovariablesbeingcorrelated.Thepurposeofthisarticleis:1)tointroduce theaffectsthedistributionsofthetwoindividualvariableshaveonthecorrelationcoefficient interval;and2)thusly,toprovideaprocedureforcalculatinganadjustedcorrelationcoefficient,
whoserealizedcorrelationcoefficientintervalisoftenshorterthantheoriginalone.
BasicsoftheCorrelationCoefficient
Thecorrelationcoefficient,denotedbyr,isameasureofthestrengthofthestraight-lineor linearrelationshipbetweentwovariables.Thewellknowncorrelationcoefficientisoften misusedbecauseitslinearityassumptionisnottested.Thecorrelationcoefficientcan–by definition,i.e.,theoretically–assumeanyvalueintheintervalbetween+1and-1,includingthe endvaluesplus/minus1.
Thefollowingpointsaretheacceptedguidelinesforinterpretingthecorrelationcoefficient:
0indicatesnolinearrelationship.
+1indicatesaperfectpositivelinearrelationship:asonevariableincreasesinitsvalues,the othervariablealsoincreasesinitsvaluesviaanexactlinearrule.
-1indicatesaperfectnegativelinearrelationship:asonevariableincreasesinitsvalues,the othervariabledecreasesinitsvaluesviaanexactlinearrule.
Prediction
Iftwovariablesareknowntoberelatedinsomesystematicway,itispossibletouseone variabletomakepredictionabouttheother.Forexample,whenastudentseeksadmissionina college,heisrequiredtosubmitagreatdealofpersonalinformation,includinghisscoresin SSCannual/supplementaryexamination.Thecollegeofficialswantthisinformationsothatthey canpredictthatstudent’schanceofsuccessincollege
AIOU Solved Assignments 2 Code 8614 Autumn & Spring 2023
———————-
Q.5Explainthefollowingtermswithexamples.
a) DegreeofFreedom
DegreesofFreedom
Theconceptofdegreesoffreedomiscentraltotheprincipleofestimatingstatisticsof populationsfromsamplesofthem.”Degreesoffreedom”iscommonlyabbreviatedtodf.
Thinkofdfasamathematicalrestrictionthatneedstobeputinplacewhenestimatingone statisticfromanestimateofanother.
Letustakeanexampleofdatathathavebeendrawnatrandomfromanormaldistribution. Normaldistributionsneedonlytwoparameters(meanandstandarddeviation)fortheir definition;e.g.thestandardnormaldistributionhasameanof0andstandarddeviation(sd)of1. Thepopulationvaluesofmeanandsdarereferredtoasmuandsigmarespectively,andthe sampleestimatesarex-barands.
Inordertoestimatesigma,wemustfirsthaveestimatedmu.Thus,muisreplacedbyx-barin theformulaforsigma.Inotherwords,weworkwiththedeviationsfrommuestimatedbythe deviationsfromx-bar.Atthispoint,weneedtoapplytherestrictionthatthedeviationsmust sumtozero.Thus,degreesoffreedomaren-1intheequationforsbelow:
Standarddeviationinapopulationis:

[xisavaluefromthepopulation,?isthemeanofallx,nisthenumberofxinthepopulation,? isthesummation]
Theestimateofpopulationstandarddeviationcalculatedfromarandomsampleis:

[xiistheithobservationfromasampleofthepopulation,x-baristhesamplemean,nisthe samplesize,?isthesummation]
Whenthisprincipleofrestrictionisappliedtoregressionandanalysisofvariance,thegeneral resultisthatyouloseonedegreeoffreedomforeachparameterestimatedpriortoestimating the(residual)standarddeviation.
Anotherwayofthinkingabouttherestrictionprinciplebehinddegreesoffreedomistoimagine contingencies.Forexample,imagineyouhavefournumbers(a,b,candd)thatmustadduptoa totalofm;youarefreetochoosethefirstthreenumbersatrandom,butthefourthmustbe chosensothatitmakesthetotalequaltom-thusyourdegreeoffreedomisthree.
b) SpreadofScores
Measuresofspreaddescribehowsimilarorvariedthesetofobservedvaluesarefora particularvariable(dataitem).Measuresofspreadincludetherange,quartilesandthe interquartilerange,varianceandstandarddeviation.
Whencanwemeasurespread?
Thespreadofthevaluescanbemeasuredforquantitativedata,asthevariablesarenumeric andcanbearrangedintoalogicalorderwithalowendvalueandahighendvalue.
Whydowemeasurespread?
Summarisingthedatasetcanhelpusunderstandthedata,especiallywhenthedatasetislarge. AsdiscussedintheMeasuresofCentralTendencypage,themode,median,andmean summarisethedataintoasinglevaluethatistypicalorrepresentativeofallthevaluesinthe dataset,butthisisonlypartofthe’picture’thatsummarisesadataset.Measuresofspread summarisethedatainawaythatshowshowscatteredthevaluesareandhowmuchtheydiffer fromthemeanvalue.

c) Sample
Instatistics,you’llbeworkingwithsamples.Asampleisjustapartofapopulation.Forexample, ifyouwanttofindouthowmuchtheaverageAmericanearns,youaren’tgoingtowantto surveyeveryoneinthepopulation(over300millionpeople),soyouwouldchooseasmall numberofpeopleinthepopulation.Forexample,youmightselect10,000people.
FindingaSample
Technically,youcan’tjustchoose10,000people.Inorderforittobestatistical(i.e.onethatyou canuseinstatistics),theactualsizemustbefoundusingastatisticalmethod.Tenthousand peoplemightnotbetheoptimalamountforvalidsurveyresults:youmayneedmore,orless. Therearemany,manywaystofindsamplesizes,includingusingdatafrompriorexperimentsor usingasizecalculator.Howyoufindasamplesizecanbequitecomplex,dependingonwhat youwanttodowithyourdata.Youcanfindoutmoreabouthowtofindthemhere:Samplesize: Howtofindit.
Methods
Ifyou’vedecidedtoassembleyoursamplefromscratch(forexample,youaren’tusingprior data),thenyouneedtochooseasamplingmethod.Whichsamplingmethodyouusedepends onwhatresourcesandinformationyouhaveavailable.Forexample,thedraftworkedby drawingrandombirthdates,amethodcalledsimplerandomsampling.Inorderforthattowork, thegovernmentneededalistofeverypotentialdraftee’snameanddateofbirth.Thedraftcould alsohaveusedsystematicsampling,drawingthenthnamefromalist(forexample,every100th name).Forthattohaveworked,allthenamesmustfirsthavebeencompiledonalist
d) ConfidenceInterval
Whatareconfidenceintervals?
Howdoweformaconfidenceinterval?
Thepurposeoftakingarandomsamplefromalotorpopulationandcomputingastatistic,such asthemeanfromthedata,istoapproximatethemeanofthepopulation.Howwellthesample statisticestimatestheunderlyingpopulationvalueisalwaysanissue.Aconfidenceinterval addressesthisissuebecauseitprovidesarangeofvalueswhichislikelytocontainthe populationparameterofinterest.
Confidencelevels
Confidenceintervalsareconstructedataconfidencelevel,suchas95%,selectedbytheuser. Whatdoesthismean?Itmeansthatifthesamepopulationissampledonnumerousoccasions andintervalestimatesaremadeoneachoccasion,theresultingintervalswouldbracketthetrue
populationparameterinapproximately95%ofthecases.Aconfidencestatedata1??level canbethoughtofastheinverseofasignificancelevel,?.
Oneandtwo-sidedconfidenceintervals
Inthesamewaythatstatisticaltestscanbeoneortwo-sided,confidenceintervalscanbeone ortwo-sided.Atwo-sidedconfidenceintervalbracketsthepopulationparameterfromabove andbelow.Aone-sidedconfidenceintervalbracketsthepopulationparametereitherfrom aboveorbelowandfurnishesanupperorlowerboundtoitsmagnitude.
Exampleofatwo-sidedconfidenceinterval
Forexample,a100(1??)%confidenceintervalforthemeanofanormalpopulationis
whereY ?isthesamplemean,z1??/2isthe1??/2criticalvalueofthestandardnormal distributionwhichisfoundinthetableofthestandardnormaldistribution,?istheknown populationstandarddeviation,andNisthesamplesize.
e) ZScore
z-score
Az-score(aka,astandardscore)indicateshowmanystandarddeviationsanelementisfrom themean.Az-scorecanbecalculatedfromthefollowingformula.
z=(X-?)/?
wherezisthez-score,Xisthevalueoftheelement,?isthepopulationmean,and?isthe standarddeviation.
Hereishowtointerpretz-scores.
Az-scorelessthan0representsanelementlessthanthemean.
Az-scoregreaterthan0representsanelementgreaterthanthemean.
Az-scoreequalto0representsanelementequaltothemean.
Az-scoreequalto1representsanelementthatis1standarddeviationgreaterthanthemean;a z-scoreequalto2,2standarddeviationsgreaterthanthemean;etc.
Az-scoreequalto-1representsanelementthatis1standarddeviationlessthanthemean;az- scoreequalto-2,2standarddeviationslessthanthemean;etc.
Ifthenumberofelementsinthesetislarge,about68%oftheelementshaveaz-scorebetween -1and1;about95%haveaz-scorebetween-2and2;andabout99%haveaz-scorebetween-3 and3.
Example
Samplequestion:YoutaketheSATandscore1100.ThemeanscorefortheSATis1026and thestandarddeviationis209.Howwelldidyouscoreonthetestcomparedtotheaveragetest taker?
Step1:WriteyourX-valueintothez-scoreequation.ForthissamplequestiontheX-valueisyour SATscore,1100.
Z=1100-?/?
Step2:Putthemean,?,intothez-scoreequation.
Z=1100-1026/?
Step3:Writethestandarddeviation,?intothez-scoreequation.
Z=1100-1026/209
Step4:Calculatetheanswerusingacalculator:
(1100–1026)/209=.354.Thismeansthatyourscorewas.354stddevsabovethemean.
Step5:(Optional)Lookupyourz-valueinthez-tabletoseewhatpercentageoftest-takers scoredbelowyou.Az-scoreof.354is.1368+.5000*=.6368or63.68%.
AIOU Solved Assignments 2 Autumn & Spring 2023 Code 8614
———————