{"id":11742,"date":"2024-09-25T15:11:40","date_gmt":"2024-09-25T14:11:40","guid":{"rendered":"https:\/\/www.blopig.com\/blog\/?p=11742"},"modified":"2024-09-27T13:38:36","modified_gmt":"2024-09-27T12:38:36","slug":"the-patterns-that-escape-us","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2024\/09\/the-patterns-that-escape-us\/","title":{"rendered":"The Patterns that Escape Us"},"content":{"rendered":"\n<div class=\"wp-block-jetpack-markdown\"><h4>Part The First: An Outrageous Claim<\/h4>\n<p>Reproduced below is the introductory passage from a psycholinguistics paper, published in the mid-nineties.\nRiveted, as I\u2019m sure you are, having just read that banger opening line to my blog post, humour me and read on; I promise it gets interesting.<\/p>\n<\/div>\n\n\n\n<!--more-->\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<div class=\"wp-block-jetpack-markdown\"><blockquote>\n<p><strong>The segmentation problem<\/strong><\/p>\n<p>The orthography of English has a very simple basis for establishing where words in written texts begin and end: both before and also after every word are empty spaces and this demarcation surely helps the reader comprehend. In a spoken text, however, as presented to a hearer, such explicit segmentation cues are rarely to be found; little pauses after every single word might make things clearer, but the input is continuous &#8211; a running stream of sound. This implies that part of listening involves an operation whereby input is segmented, to be processed word by word, for we cannot hold in memory each total collocation, as most sentences we come across are previously unheard. Yet we listeners experience no sense of some dramatic act of separating input into pieces that are known; as we listen to an utterance it seems unproblematic &#8211; words in sentences seem just as clear as words that stand alone. 
Just how listeners accomplish such an effortless division is a question that psychologists have now begun to solve, and this paper will describe (although with minimal precision) some experimental studies showing what it might involve. The findings, as this summary explains, at once can vindicate the order of the problem and the hearer\u2019s sense of ease, for though speech must be segmented, yet the data plainly indicate that rhythm in the input makes segmenting speech a breeze.<\/p>\n<\/blockquote>\n<p>Notice any stylistic choices that might be considered odd for a research paper?\nAny particular feature of the word usage that jumps out at you?\nDoes the text seem strangely hard to read?\n(I promise this isn\u2019t a post about \u2018proper writing style\u2019 \u2013 I\u2019d hardly be qualified.)<\/p>\n<p>OK, how about we try some re-formatting?<\/p>\n<\/div>\n\n\n\n<pre class=\"wp-block-verse has-text-align-center has-dark-gray-color has-text-color has-link-color has-small-font-size wp-elements-63d48be98b6d6c2b57aebea7744c3cab\"><em><strong>The segmentation problem<\/strong><br><br> The orthography of English has a very simple basis <br>for establishing where words in written texts begin and end: <br> both before and also after every word are empty spaces <br> and this demarcation surely helps the reader comprehend.<br><br>In a spoken text, however, as presented to a hearer, <br>such explicit segmentation cues are rarely to be found; <br>little pauses after every single word might make things clearer, <br>but the input is continuous - a running stream of sound. 
<br><br>This implies that part of listening involves an operation <br>whereby input is segmented, to be processed word by word, <br>for we cannot hold in memory each total collocation,<br>as most sentences we come across are previously unheard.<br><br>Yet we listeners experience no sense of some dramatic <br>act of separating input into pieces that are known; <br>as we listen to an utterance it seems unproblematic - <br>words in sentences seem just as clear as words that stand alone. <br><br>Just how listeners accomplish such an effortless division <br>is a question that psychologists have now begun to solve,  <br>and this paper will describe (although with minimal precision) <br>some experimental studies showing what it might involve. <br><br>The findings, as this summary explains, at once can vindicate <br>the order of the problem and the hearer's sense of ease, <br>for though speech must be segmented, yet the data plainly indicate <br>that rhythm in the input makes segmenting speech a breeze.<\/em><\/pre>\n\n\n\n<div class=\"wp-block-jetpack-markdown\"><p>If it isn\u2019t clear yet, this is perfectly metered verse with rhyming lines; a poem with a very regular rhythm, typeset as prose.\nOnce this rhythm was pointed out to me (or read out, rather), it was impossible to unsee.\nI can <em>hear<\/em> it in my head; when I re-read the original text, it jumps out at me.\nYet, first time round, I completely failed to notice it, despite the stark, obvious, unfailing regularity.\nHow did I miss such an obvious pattern?\nHow could anyone?<\/p>\n<p>The passage is from a 1994 <em>Cognition<\/em> paper by Anne Cutler, entitled <a href=\"https:\/\/repository.ubn.ru.nl\/bitstream\/handle\/2066\/15628\/6033.pdf\">\u2018The perception of rhythm in language\u2019<\/a>.\nYes, the entire three-page thing is written in verse (and is\u2013in my opinion\u2013a delight to read, once you pick up on the rhythm).\nNo, the vast majority of its readers will not have noticed this fact until 
the text impishly hints at it in the very last paragraph.<\/p>\n<p>Content-wise, the thrust of the paper is as follows:<\/p>\n<ol>\n<li>To understand speech, human brains must somehow chop up continuous audio-stream input into discrete tokens (words) with defined meanings \u2013 otherwise they would have to somehow store a giant hash-table mapping every possible sequence of language sounds to its intended meaning, which seems implausible.\n<ul>\n<li>side-note: This should be quite intuitive, if you\u2019ve ever studied a foreign language: Because. Native. Speakers. Don\u2019t. Speak. With. Pauses. Like. This, pinpointing where exactly in their speech one unknown word ends and where another one begins is among the hardest challenges, when first starting out.<\/li>\n<li>second side-note:  NLP folks face the same problem, when building speech-recognition models. I don\u2019t know much about how they go about solving it, but one assumes there are parallels there.<\/li>\n<\/ul>\n<\/li>\n<li>The paper contends that human brains solve this segmentation problem by attending closely to the rhythm in the audio stream. The exact rhythmic patterns they latch onto differ between languages, but in all cases the rhythm helps define syllable- and word-boundaries, which then allow the stream to be properly chopped up and processed.<\/li>\n<li>However, when we <em>read<\/em> a text, the problem doesn\u2019t arise, because words are separated from each other by spaces \u2013 it is already clear how the input should be segmented into semantic units or tokens. 
So people cease to attend to the rhythm of the language (presumably to save cognitive labour), unless the text is typeset to make rhymes and rhythm obvious.<\/li>\n<\/ol>\n<p>Now, I don\u2019t know whether humans actually solve the segmentation problem in this manner.\nI imagine linguists, cognitive scientists and NLP people will have a much more informed opinion on this, than I \u2013 in fact, I\u2019m reliably informed this notion is quite outdated.\nThat is not the point of this blog post.<\/p>\n<p>What I took exception to is the third point only \u2013 because I just didn\u2019t believe it.\nSurely, when the rhythm is so blindingly obvious when pointed out, <em>most<\/em> people will eventually notice on their own, unprompted?\nWhat if they\u2019re asked to read it out loud, transforming the text back into speech?\nWhat if they do so fluently and proficiently, stressing the words in exactly the right pattern, and can <em>hear themselves doing it<\/em> \u2013 surely then they must notice?\nMaybe non-native speakers have an advantage, being perhaps less comfortable with the language and relying more on subvocalising (\u2018reading it out loud in their heads\u2019).\nOr perhaps natives are more likely to pick up the pattern, fully at home in their language and confronted here with a very regularly patterned, high signal-to-noise ratio anomaly.\nSurely, <em>surely<\/em> someone must notice?<\/p>\n<h4>Part The Second: The Survey<\/h4>\n<p>I\u2019m nothing if not an empiricist, so I conducted a survey among the OPIGlets, none of whom had previously encountered the text.\n(N=15, though, so take this with a grain of salt.)\nParticipants self-identified as either native or non-native speakers of English, and as either speakers or non-speakers of another (non-English) language.\nThey were then invited to read the above passage (presented as \u2018an excerpt from a research article\u2019) either silently or out loud, and to comment on any stylistic peculiarities they 
might have noticed.<\/p>\n<p>The sample composition was as follows:<\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th style=\"text-align:center\">natives<\/th>\n<th style=\"text-align:center\">non-natives<\/th>\n<th>TOTAL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>speak other language<\/td>\n<td style=\"text-align:center\">7<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td>11<\/td>\n<\/tr>\n<tr>\n<td>do not speak other language<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td style=\"text-align:center\">0<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>TOTAL<\/td>\n<td style=\"text-align:center\">11<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td>15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Participants were allocated to treatment groups approximately evenly, stratified by native language,<\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th style=\"text-align:center\">natives<\/th>\n<th style=\"text-align:center\">non-natives<\/th>\n<th>TOTAL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>asked to read silently<\/td>\n<td style=\"text-align:center\">6<\/td>\n<td style=\"text-align:center\">2<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>asked to read out loud<\/td>\n<td style=\"text-align:center\">5<\/td>\n<td style=\"text-align:center\">2<\/td>\n<td>7<\/td>\n<\/tr>\n<tr>\n<td>TOTAL<\/td>\n<td style=\"text-align:center\">11<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td>15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>but without regard for whether they spoke a non-English language:<\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th style=\"text-align:center\">speak other languages<\/th>\n<th style=\"text-align:center\">don\u2019t speak other languages<\/th>\n<th>TOTAL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>asked to read silently<\/td>\n<td style=\"text-align:center\">5<\/td>\n<td style=\"text-align:center\">3<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>asked to read out loud<\/td>\n<td style=\"text-align:center\">6<\/td>\n<td 
style=\"text-align:center\">1<\/td>\n<td>7<\/td>\n<\/tr>\n<tr>\n<td>TOTAL<\/td>\n<td style=\"text-align:center\">11<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td>15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h4>Part The Third: The Results<\/h4>\n<p>First of all, I was shocked \u2013<em>shocked<\/em>\u2013 to find that it really <em>is<\/em> rare for people to pick up on this.\nIt wasn\u2019t just me.\nOnly one person (6.7% of participants) could specifically point to the fact that the text had a well-defined rhythm which was the cause of the unusual structure.\nThat person was a non-native speaker, asked to read the text out loud.\nA native speaker (read silently, spoke no foreign languages) identified individual rhymes, but no overall rhythmic pattern, even when prompted.\nAnother native speaker (read out loud, spoke a foreign language) commented on the use of uncharacteristically poetic language for a research article \u2013noting a less clinical, emotionally evocative style\u2013 but did not specifically identify rhyme or metre as the underlying cause, even when prompted.\nEverybody else missed it entirely.\nThis includes one person who read it out so fluently and in such perfect prosody that any listener would have immediately identified their performance as the recitation of a poem; but no, the reader themselves did not notice.<\/p>\n<p>One person (native, reading silently, spoke a foreign language) offered that the text might be a word-for-word translation of some kind; foreign syntax imposed on English, impeding \u2018flow\u2019 in strange ways.\nA lack of flow was noted by almost all participants (which is ironic, given what is actually going on \u2013 I\u2019d argue the text flows rather well).\nMany commented on the unusual sentence length and perceived the text as (overly) verbose, oddly punctuated, and hard to read.<\/p>\n<p>There\u2019s of course limited insight one can get from a sample this small, but overinterpreting results is fun \u2013 so I 
went ahead and checked for pairwise associations anyway.\nIn order of significance:<\/p>\n<p><strong>Speaks Foreign Language vs. Noticing Rhythm: NOT SIGNIFICANT<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th style=\"text-align:center\">noticed rhythm<\/th>\n<th style=\"text-align:center\">did not notice rhythm<\/th>\n<th>TOTAL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>speaks foreign language<\/td>\n<td style=\"text-align:center\">1<\/td>\n<td style=\"text-align:center\">10<\/td>\n<td>11<\/td>\n<\/tr>\n<tr>\n<td>doesn\u2019t speak foreign language<\/td>\n<td style=\"text-align:center\">0<\/td>\n<td style=\"text-align:center\">4<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>TOTAL<\/td>\n<td style=\"text-align:center\">1<\/td>\n<td style=\"text-align:center\">14<\/td>\n<td>15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import scipy.stats as stats\ncontingency_table = [[1, 10], [0, 4]]\nodds_ratio, p_value = stats.fisher_exact(contingency_table)\nprint(odds_ratio, p_value)\n\n&gt; (inf, 1.0)<\/pre>\n\n\n\n<div class=\"wp-block-jetpack-markdown\"><p><strong>Reading Mode vs. 
Noticing Rhythm: VERY PROBABLY NOT SIGNIFICANT<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th style=\"text-align:center\">noticed rhythm<\/th>\n<th style=\"text-align:center\">did not notice rhythm<\/th>\n<th>TOTAL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>asked to read silently<\/td>\n<td style=\"text-align:center\">0<\/td>\n<td style=\"text-align:center\">8<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>asked to read out loud<\/td>\n<td style=\"text-align:center\">1<\/td>\n<td style=\"text-align:center\">6<\/td>\n<td>7<\/td>\n<\/tr>\n<tr>\n<td>TOTAL<\/td>\n<td style=\"text-align:center\">1<\/td>\n<td style=\"text-align:center\">14<\/td>\n<td>15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<pre><code class=\"language-python\">&gt; (0.0, 0.4666666666666667)\n<\/code><\/pre>\n<p><strong>Native Speaker vs. Noticing Rhythm: PROBABLY NOT SIGNIFICANT?<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<th><\/th>\n<th style=\"text-align:center\">noticed rhythm<\/th>\n<th style=\"text-align:center\">did not notice rhythm<\/th>\n<th>TOTAL<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>native<\/td>\n<td style=\"text-align:center\">0<\/td>\n<td style=\"text-align:center\">11<\/td>\n<td>11<\/td>\n<\/tr>\n<tr>\n<td>non-native<\/td>\n<td style=\"text-align:center\">1<\/td>\n<td style=\"text-align:center\">3<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>TOTAL<\/td>\n<td style=\"text-align:center\">1<\/td>\n<td style=\"text-align:center\">14<\/td>\n<td>15<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<pre><code class=\"language-python\">&gt;(0.0, 0.26666666666666666)\n<\/code><\/pre>\n<p>I dunno, maybe this last one is something to look into with a larger sample\u2026 \ud83d\ude09<\/p>\n<h4>Part The Fourth: Get to The Point Already<\/h4>\n<p>Without, of course, taking my <em>very professional, highly scientific, perfectly engineered<\/em> study too seriously, I find it fascinating that most people, including myself, seem to consistently miss this pattern, even though it seems blindingly obvious in hindsight (at 
least to me).\nFor what it\u2019s worth, Claude (<code>sonnet-3.5<\/code>) also failed to identify the rhythm, though, to be fair, its architecture is not really laid out to notice things in pronunciations it has never heard.\nI\u2019m half-certain someone out there has done this study properly and I\u2019d be really interested to know if non-natives tend to notice more or less frequently, for example.\nI also highly recommend <a href=\"https:\/\/repository.ubn.ru.nl\/bitstream\/handle\/2066\/15628\/6033.pdf\">the original paper<\/a> &#8211; it\u2019s just a joy to read out loud, and quite short, too.<\/p>\n<p>More importantly, however, there\u2019s a point I want to make about the patterns right in front of us we simply fail to see.\nA simple change in your perspective\u2019s often all that\u2019s really needed for the pieces to fall into place \u2013 whatever they may be.<\/p>\n<p>Oh, look, that last bit rhymed.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":125,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"ngg_post_thumbnail":0,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[46],"tags":[794,795],"ppma_author":[783],"class_list":["post-11742","post","type-post","status-publish","format-standard","hentry","category-humour","tag-linguistics","tag-survey"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":783,"user_id":125,"is_guest":0,"slug":"ody","display_name":"Odysseas 
Vavourakis","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/b74030bdaef5f39ec32be3ae7bb5af054cbcb0b431b1cc51ba1b41d723ecee48?s=96&d=mm&r=g","author_category":"","user_url":"","last_name":"Vavourakis","first_name":"Odysseas","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/11742","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/125"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=11742"}],"version-history":[{"count":4,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/11742\/revisions"}],"predecessor-version":[{"id":11760,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/11742\/revisions\/11760"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=11742"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=11742"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=11742"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=11742"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}