{"id":896,"date":"2018-05-08T00:15:22","date_gmt":"2018-05-07T21:15:22","guid":{"rendered":"http:\/\/esanu.name\/vitalie\/?p=896"},"modified":"2018-05-08T00:15:22","modified_gmt":"2018-05-07T21:15:22","slug":"despre-sondaje-si-cum-ele-sunt-gresit-facute-si-pe-la-casele-mari","status":"publish","type":"post","link":"http:\/\/esanu.name\/vitalie\/?p=896","title":{"rendered":"On polls and how they are done wrong, even by the big names."},"content":{"rendered":"<p><a style=\"font-size: 1rem; color: #0f3647;\" href=\"http:\/\/esanu.name\/vitalie\/wp-content\/uploads\/2018\/05\/bell-curve.gif\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-897\" alt=\"bell curve\" src=\"http:\/\/esanu.name\/vitalie\/wp-content\/uploads\/2018\/05\/bell-curve.gif\" width=\"591\" height=\"303\" \/><\/a><\/p>\n<p>When running a poll, the organizers try to choose a sample that is as representative as possible, so that the poll is as accurate as possible. Once a relatively large number of respondents has been collected, the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Standard_deviation\">margin of error<\/a> is stated and the poll results are announced in the press. Note, however, that the margin of error is really a technical error (that is, how many percentage points the result may be off through random sampling alone, given the sample size). It is not the deviation from the pure truth that the poll is supposed to find. 
My recommendation is to leave this error out of the slides, so that it does not mislead journalists.<\/p>\n<p>My subjective opinion is that all polls conducted so far have been <strong>done wrong<\/strong>. Let me explain why.<\/p>\n<p>The problem lies in choosing the representative sample.<\/p>\n<p>When we humans try to build a sample, we make up a few categories of people, look at how they are distributed in society, and then scale those proportions down to the sample.<\/p>\n<p>For example: if the sample is 1000 people and students make up 5% of the country, the sample must contain exactly 50 students, no more and no less.<\/p>\n<p>In <a href=\"https:\/\/en.wikipedia.org\/wiki\/Machine_learning\">Machine Learning<\/a>, these categories are called <a href=\"https:\/\/en.wikipedia.org\/wiki\/Feature_(machine_learning)\">features<\/a>. We humans try to guess which categories of people might vote differently from one another and include both in the poll; Machine Learning takes into account every checkbox each person ticks and tries to work out which of those checkboxes actually influence the person's decisions.<\/p>\n<p>Extracting features or embeddings is nothing new in deep learning. Remember how Google Translate invented its <a href=\"http:\/\/www.wired.co.uk\/article\/google-ai-language-create\">own language<\/a>? 
Or how <a href=\"https:\/\/diacritice.ai\">diacritice.ai<\/a> places diacritics correctly without even knowing the rules of the Romanian language?<\/p>\n<p>So what should a properly done poll look like?<\/p>\n<ol>\n<li>Collect as much information as possible about people who voted in the past and how they voted.<\/li>\n<li>Build <a href=\"https:\/\/en.wikipedia.org\/wiki\/Embedding\">embeddings<\/a> from this data.<\/li>\n<li>Run a poll and collect as many respondents as possible.<\/li>\n<li>Remove the respondents who are not relevant according to the proportions of the embeddings.<\/li>\n<li>Publish the statistics from the remaining respondents.<\/li>\n<\/ol>\n<p>I know, you will say that nobody will ever disclose whom they voted for, let alone a long list of checkboxes that reveal everything about them. I agree with you.<\/p>\n<p>Until then, all we can do is play at pseudo-polls.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When running a poll, the organizers try to choose a sample that is as representative as possible, so that the poll is as accurate as possible. Once a relatively large number of respondents has been collected, the margin of error is stated and the poll results are announced in the press. 
Note, however, that the margin of error is really a technical error (that is, how many &#8230; <a title=\"On polls and how they are done wrong, even by the big names.\" class=\"read-more\" href=\"http:\/\/esanu.name\/vitalie\/?p=896\" aria-label=\"More on On polls and how they are done wrong, even by the big names.\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-896","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=\/wp\/v2\/posts\/896","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=896"}],"version-history":[{"count":0,"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=\/wp\/v2\/posts\/896\/revisions"}],"wp:attachment":[{"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=896"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=896"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/esanu.name\/vitalie\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=896"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}