The Internet and Semantic Structure: A Re-Examination

Bibliographic Details
Main Author: Jack B. Arnold
Format: Article
Language: English
Published: SAGE Publishing 2011-04-01
Series: Methodological Innovations
Online Access: https://doi.org/10.4256/mio.2010.0031
_version_ 1818454630445613056
author Jack B. Arnold
collection DOAJ
description I discuss an earlier article arguing that Google counts for paired concepts reflect the similarity of those concepts, and that analysis of sets of paired queries reflects semantic structure in the sense used earlier by Osgood. I argued there that such structure could be useful as a source of ideas about attitudes and as pilot data for survey research. I note that recent observations indicate that some sets of page counts are not logically consistent and so are not useful for such research. Here, I examine variables associated with one type of logical error: conjunction errors, where the count for a pair of concepts is greater than the count for one or both of the individual concepts. I review Osgood's idea of semantic space and the logic of similarity related to search engine counts. I then present examples of two count matrices, one usable and the other with a number of conjunction errors. Next, I use the “usable” matrix as input for multidimensional scaling, and present the output as an example of the semantic order to be found in usable data. Finally, I examine page counts for pairs of concepts from fifteen concept sets, using three search engines: Google, Yahoo and Bing. Google and Bing returned many errors; Yahoo returned none. For Google and Bing, there is a statistically significant relationship between conjunction errors and the complexity of the concept names. Queries with two or more terms in the concept names produced fewer errors than those with single-term names. More complex queries also tended to produce smaller page counts, and smaller counts are also associated with fewer conjunction errors. However, these data are consistent with the hypothesis that complexity has an effect on error that is independent of count size. I suggest several ways to avoid conjunction errors. I conclude that page counts can be used with some confidence in the social/behavioral sciences, and discuss promising avenues for further research.
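The conjunction-error condition defined in the description can be illustrated with a minimal sketch. This is not the article's procedure: the function name, the data layout, and every count below are hypothetical and invented for illustration. The check flags a pair whenever its joint count exceeds the smaller of the two single-concept counts, which is equivalent to exceeding "one or both" of them.

```python
from itertools import combinations


def find_conjunction_errors(single_counts, pair_counts):
    """Return (a, b, joint) for each pair whose joint count exceeds
    the count of at least one of its two individual concepts.

    pair_counts is keyed by (a, b) tuples with a < b alphabetically.
    """
    errors = []
    for a, b in combinations(sorted(single_counts), 2):
        joint = pair_counts.get((a, b))
        if joint is None:
            continue  # no count recorded for this pair
        # Logically, pages matching both concepts cannot outnumber
        # pages matching either concept alone.
        if joint > min(single_counts[a], single_counts[b]):
            errors.append((a, b, joint))
    return errors


# Hypothetical page counts, not taken from the article.
single_counts = {"cat": 1000, "dog": 800, "fish": 500}
pair_counts = {
    ("cat", "dog"): 300,
    ("cat", "fish"): 650,  # larger than the count for "fish" alone
    ("dog", "fish"): 120,
}

errors = find_conjunction_errors(single_counts, pair_counts)
print(errors)
```

A count matrix for which this check returns an empty list is logically consistent in the sense the description uses; whether it is otherwise suitable for scaling is a separate question.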
first_indexed 2024-12-14T21:57:56Z
format Article
id doaj.art-7a91ce148ce4482eb8a08cd89d989085
institution Directory Open Access Journal
issn 2059-7991
language English
last_indexed 2024-12-14T21:57:56Z
publishDate 2011-04-01
publisher SAGE Publishing
record_format Article
series Methodological Innovations