The Internet and Semantic Structure: A Re-Examination

I discuss an earlier article arguing that Google counts for paired concepts reflect the similarity of concepts and that analysis of sets of paired queries reflects semantic structure, in the sense used earlier by Osgood. I argued there that such structure could be useful as a source of ideas about att...

Bibliographic Details
Main Author:	Jack B. Arnold
Format:	Article
Language:	English
Published:	SAGE Publishing 2011-04-01
Series:	Methodological Innovations
Online Access:	https://doi.org/10.4256/mio.2010.0031

_version_	1818454630445613056
author	Jack B. Arnold
author_facet	Jack B. Arnold
author_sort	Jack B. Arnold
collection	DOAJ
description	I discuss an earlier article arguing that Google counts for paired concepts reflect the similarity of concepts and that analysis of sets of paired queries reflects semantic structure, in the sense used earlier by Osgood. I argued there that such structure could be useful as a source of ideas about attitudes and as pilot data for survey research. I note that recent observations indicate that some page count sets are not logically consistent and so are not useful for such research. Here, I examine variables associated with one type of logical error: conjunction errors, where the count for a pair of concepts is greater than the count for one or both of the individual concepts. I review Osgood's idea of semantic space and the logic of similarity related to search engine counts. I then present examples of two count matrices, one usable and the other with a number of conjunction errors. Next, I use the “usable” matrix as input for multidimensional scaling, and present the output as an example of the semantic order to be found in usable data. Finally, I examine page counts for pairs of concepts from fifteen concept sets. I used three search engines: Google, Yahoo and Bing. Google and Bing returned many errors; Yahoo returned none. For Google and Bing, there is a statistically significant relationship between conjunction errors and the complexity of the concept names: queries with two or more terms in the names produced fewer errors than ones with single-term names. More complex queries also tended to produce smaller page counts, and smaller counts are also associated with fewer conjunction errors. However, these data are consistent with the hypothesis that complexity has an effect on error that is independent of count size. I suggest several ways to avoid conjunction errors. I conclude that page counts can be used with some confidence in the social/behavioral sciences, and discuss promising avenues for further research.
first_indexed	2024-12-14T21:57:56Z
format	Article
id	doaj.art-7a91ce148ce4482eb8a08cd89d989085
institution	Directory Open Access Journal
issn	2059-7991
language	English
last_indexed	2024-12-14T21:57:56Z
publishDate	2011-04-01
publisher	SAGE Publishing
record_format	Article
series	Methodological Innovations
spellingShingle	Jack B. Arnold The Internet and Semantic Structure: A Re-Examination Methodological Innovations
title	The Internet and Semantic Structure: A Re-Examination
title_full	The Internet and Semantic Structure: A Re-Examination
title_fullStr	The Internet and Semantic Structure: A Re-Examination
title_full_unstemmed	The Internet and Semantic Structure: A Re-Examination
title_short	The Internet and Semantic Structure: A Re-Examination
title_sort	internet and semantic structure a re examination
url	https://doi.org/10.4256/mio.2010.0031
work_keys_str_mv	AT jackbarnold theinternetandsemanticstructureareexamination AT jackbarnold internetandsemanticstructureareexamination
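
The conjunction error defined in the description (a pair count larger than the count for one or both of its individual concepts) amounts to a simple consistency check: a page that contains both concepts also contains each one alone, so the pair count should never exceed the smaller of the two single counts. The following is a minimal sketch of that check, not the author's procedure. The function name `conjunction_errors`, the dictionary layout, and all counts are hypothetical and invented for illustration; none of them come from the article.

```python
from itertools import combinations


def conjunction_errors(single_counts, pair_counts):
    """Return (a, b, joint) for every pair whose joint page count
    exceeds the count of at least one of its individual concepts."""
    errors = []
    for a, b in combinations(sorted(single_counts), 2):
        # Accept the pair under either key order.
        joint = pair_counts.get((a, b), pair_counts.get((b, a)))
        if joint is None:
            continue  # this pair was not queried
        # Logically, count(a AND b) <= min(count(a), count(b)).
        if joint > min(single_counts[a], single_counts[b]):
            errors.append((a, b, joint))
    return errors


# Hypothetical counts for illustration only (not data from the article).
single_counts = {"war": 1000, "peace": 800, "trade": 500}
pair_counts = {
    ("peace", "war"): 300,
    ("trade", "war"): 650,  # larger than the count for "trade"
    ("peace", "trade"): 120,
}

errors = conjunction_errors(single_counts, pair_counts)
print(errors)
```

A count set for which this check returns an empty list would correspond to what the description calls a "usable" matrix, at least with respect to this one type of logical error.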

Similar Items