First, I like your summary of material on Atmospheric Tides. It covers much of the topic and is a great starting point. But the form leaves a lot of work for the researcher, particularly for topics that might have tens or hundreds of thousands of closely connected URLs on the Internet. These “millions of results” and “billions of results” for topics on the Internet are what I have been studying for the last 25 years. Things like “covid” or “global climate change” are fairly obvious examples: covid with billions of fragments on the Internet, and climate change being dragged in many different directions.
So you do give the kinds of papers and book chapters someone 20 years ago might have been happy with. Now, serious analysis of whole topics on the Internet is starting to get wrapped up with “AI assisted” and “full topic processing” of all materials on the Internet. Things like language translation and these large language models are only part of it. Pretty much every subject, issue, topic, and new opportunity on the Internet is attracting the attention of groups saying, “let’s get our arms around the whole thing,” and treating it seriously enough to actually do that.
The cost of compiling, cleaning, and indexing is dropping. Or it was for a while. The AI groups started to do it “for the good of all,” but then money got involved and prices went up. So chasing those groups can be a losing proposition. Still, some effort toward “build a complete and understandable map of the whole of the subject or topic” is possible. Not by the brute-force approach of “compile it from raw text with no curation or pre-processing, and no involvement of the subject groups themselves.”
Richard Collins, The Internet Foundation