The Internet and Semantic Structure: A Re-Examination
I discuss an earlier article arguing that Google counts for paired concepts reflect the similarity of concepts and that analysis of sets of paired queries reflects semantic structure, in the sense used earlier by Osgood. I argued there that such structure could be useful as a source of ideas about att...
Main Author: | Jack B. Arnold |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing, 2011-04-01 |
Series: | Methodological Innovations |
Online Access: | https://doi.org/10.4256/mio.2010.0031 |
_version_ | 1818454630445613056 |
---|---|
author | Jack B. Arnold |
collection | DOAJ |
description | I discuss an earlier article arguing that Google counts for paired concepts reflect the similarity of concepts and that analysis of sets of paired queries reflects semantic structure, in the sense used earlier by Osgood. I argued there that such structure could be useful as a source of ideas about attitudes and pilot data for survey research. I note that recent observations indicate that some page count sets are not logically consistent and are not useful for such research. Here, I examine variables associated with one type of logical error: conjunction errors, where counts for pairs of concepts are greater than counts for one or both of the individual concepts. I review Osgood's idea of semantic space and the logic of similarity related to search engine counts. I then present examples of two count matrices, one usable and the other with a number of conjunction errors. Next, I use the “usable” matrix as input for multidimensional scaling and present the output as an example of the semantic order to be found in usable data. Finally, I examine page counts from pairs of concepts from fifteen concept sets. I used three search engines: Google, Yahoo and Bing. Google and Bing returned many errors. Yahoo returned none. Google and Bing show a statistically significant relationship between conjunction errors and the complexity of the concept names. Queries whose concept names contain two or more terms produced fewer errors than those with single-term names. More complex queries also tended to produce smaller page counts, and smaller counts are also associated with fewer conjunction errors. However, these data are consistent with the hypothesis that complexity has an effect on error that is independent of count size. I suggest several ways to avoid conjunction errors. I conclude that page counts can be used with some confidence in the social/behavioral sciences, and discuss promising avenues for further research. |
first_indexed | 2024-12-14T21:57:56Z |
format | Article |
id | doaj.art-7a91ce148ce4482eb8a08cd89d989085 |
institution | Directory Open Access Journal |
issn | 2059-7991 |
language | English |
last_indexed | 2024-12-14T21:57:56Z |
publishDate | 2011-04-01 |
publisher | SAGE Publishing |
record_format | Article |
series | Methodological Innovations |
title | The Internet and Semantic Structure: A Re-Examination |
url | https://doi.org/10.4256/mio.2010.0031 |
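
The description defines a conjunction error as a case where the page count for a pair of concepts is greater than the count for one or both of the individual concepts. The following is a minimal sketch of how such errors might be flagged in a set of counts. It is not the author's procedure: the function name `find_conjunction_errors`, the dictionary layout, and all of the counts are assumptions invented for illustration.

```python
from itertools import combinations


def find_conjunction_errors(single_counts, pair_counts):
    """Return (a, b, joint) for every pair whose joint count exceeds
    the count of at least one of its two individual concepts."""
    errors = []
    for a, b in combinations(sorted(single_counts), 2):
        # Accept the pair stored in either order.
        joint = pair_counts.get((a, b), pair_counts.get((b, a)))
        if joint is None:
            continue  # this pair was not queried
        # Pages containing both terms cannot logically outnumber
        # pages containing either term alone.
        if joint > min(single_counts[a], single_counts[b]):
            errors.append((a, b, joint))
    return errors


# Hypothetical counts, invented for illustration; not data from the article.
single_counts = {"cat": 1000, "dog": 800, "fish": 500}
pair_counts = {
    ("cat", "dog"): 300,
    ("cat", "fish"): 650,  # exceeds the count for "fish" alone
    ("dog", "fish"): 120,
}

errors = find_conjunction_errors(single_counts, pair_counts)
print(errors)
```

Comparing the joint count against the smaller of the two single counts covers both cases named in the description (greater than one, or greater than both). A count set that yields an empty list would be free of this particular type of inconsistency, though other logical errors could remain.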